DOCUMENT RESUME

ED 109 143                                                  TM 004 525

AUTHOR          Cohen, Monroe D., Ed.
TITLE           Testing and Evaluation: New Views.
INSTITUTION     Association for Childhood Education International,
                Washington, D.C.
PUB DATE        75
NOTE            68p.
AVAILABLE FROM  Association for Childhood Education International,
                3615 Wisconsin Ave., N.W., Washington, D.C. 20016
                ($2.50)

EDRS PRICE      MF-$0.76 PLUS POSTAGE. HC Not Available from EDRS.
DESCRIPTORS     Academic Achievement; Classroom Environment;
                *Educational Assessment; Educational Environment;
                Educational Improvement; Elementary Education;
                *Evaluation Methods; Intellectual Development;
                Intelligence Tests; Learning; Participant
                Satisfaction; *Student Evaluation; Student
                Motivation; Student Teacher Relationship; Teacher
                Attitudes; Teacher Role; Test Construction; *Testing;
                Test Reliability; *Test Validity

ABSTRACT
     The methods and procedures of education in our schools must be
reexamined according to the writers who have contributed to this
document. This is most strongly felt in the areas of testing and
evaluation for it is these areas that may have the greatest impact on a
child's educational career. This document is presented in three
sections: (1) An Overview, (2) Testing Problems and Possibilities, and
(3) Some Examples of Meaningful Evaluation. Each author has set forth
what they see as inadequacies in the evaluation and testing procedures
of our educational systems. The role of student and teacher are
considered throughout the document. (DEP)




Documents acquired by ERIC include many informal unpublished materials
not available from other sources. ERIC makes every effort to obtain the
best copy available. Nevertheless, items of marginal reproducibility are
often encountered and this affects the quality of the microfiche and
hardcopy reproductions ERIC makes available via the ERIC Document
Reproduction Service (EDRS). EDRS is not responsible for the quality of
the original document. Reproductions supplied by EDRS are the best that
can be made from the original.






PERMISSION TO REPRODUCE THIS COPYRIGHTED MATERIAL BY MICROFICHE ONLY
HAS BEEN GRANTED BY

TO ERIC AND ORGANIZATIONS OPERATING UNDER AGREEMENTS WITH THE NATIONAL
INSTITUTE OF EDUCATION. FURTHER REPRODUCTION OUTSIDE THE ERIC SYSTEM
REQUIRES PERMISSION OF THE COPYRIGHT OWNER.






U.S. DEPARTMENT OF HEALTH, EDUCATION & WELFARE
NATIONAL INSTITUTE OF EDUCATION

THIS DOCUMENT HAS BEEN REPRODUCED EXACTLY AS RECEIVED FROM THE PERSON
OR ORGANIZATION ORIGINATING IT. POINTS OF VIEW OR OPINIONS STATED DO
NOT NECESSARILY REPRESENT OFFICIAL NATIONAL INSTITUTE OF EDUCATION
POSITION OR POLICY.





Testing and Evaluation: New Views

Association for Childhood Education International
3615 Wisconsin Avenue, N.W.
Washington, D.C. 20016

Vito Perrone, Coordinator
Monroe D. Cohen, Editor
Lucy Prete Martin, Assistant Editor



Copyright © 1975, Association for Childhood Education International,
3615 Wisconsin Avenue, N.W., Washington, D.C. 20016

Library of Congress Cataloging in Publication Data

Main entry under title:

Testing and evaluation.
Bibliography: p. 62
1. Educational tests and measurements. 2. Reading--Ability testing.
I. Perrone, Vito. II. Cohen, Monroe D., 1924- ed. III. Martin, Lucy
Prete, ed. IV. Association for Childhood Education International.
LB3051.T44     371.2'6     74-34211
ISBN 0-87173-000-6

1974-75 Annual Bulletin Order

Assistance from the Rockefeller Brothers Fund in the planning and
development of this publication is gratefully acknowledged.

Distributed jointly by the Association for Childhood Education
International, Washington, D.C., and Citation Press, New York

Design by McCurk & Associates



Contents

PART I  Overview

Introduction: A Time for Rethinking
Vito Perrone, Dean, Center for Teaching and Learning, University of
North Dakota, Grand Forks

Alternative Ways in Educational Evaluation
Anne M. Bussis, Edward A. Chittenden, and Marianne Amarel, Research
Psychologists, Educational Testing Service, Princeton, New Jersey

PART II  Testing Problems and Possibilities

What Tests Do and Don't Do
Susan Silverman Stodolsky, Associate Professor of Education and Human
Development, Department of Education and Committee on Human
Development, University of Chicago

Understanding the Gobble-dy-gook: A People's Guide to Standardized Test
Results and Statistics
Michael Quinn Patton, Postdoctoral Fellow in Evaluation Methodology and
Assistant Professor of Sociology, Department of Sociology, University
of Minnesota, Minneapolis

Standardized Testing: Reform Is Not Enough!
George E. Hein, Educational Consultant and Writer, currently
Coordinator, Independent Studies Program, Lesley College, Cambridge,
Massachusetts

Another Look at What's Wrong with Reading Tests
Deborah Meier, Teacher-Director, Central Park East, a mini-public
school in District 4, Manhattan, and Consultant to City College
Advisory Service to Open Corridors, New York City

The Stranglehold of Norms on the Individual Child
Lois Barclay Murphy, formerly with Menninger Clinic, Topeka, Kansas,
currently Freelance Consultant and Writer in fields of Child
Development and Early Childhood Education, Washington, D.C.

PART III  Some Examples of Meaningful Evaluation

The Prospect School: Taking Account of Process
Patricia F. Carini, Director, Adjunct Services Program, The Prospect
School, North Bennington, Vermont

Marcy Open School: Feeding Back to Decision-Makers
Ruth Anne Aldrich, Internal Evaluator, Minneapolis Public Schools,
Southeast Alternatives, assigned to Marcy Open Elementary School and to
Marshall University High School Open Middle Program

Children's Interviews
Nancy Ann Miller, Member, Human Relations Cluster, Center for Teaching
and Learning, University of North Dakota, Grand Forks

Reflection in Teaching
Anne M. Bussis and Edward A. Chittenden, Research Psychologists,
Educational Testing Service, Princeton, New Jersey

Selected Bibliography
Brenda S. Engel, Consultant in Open Education, Cambridge, Massachusetts



Part I

Overview

Introduction: A Time for Rethinking



In November 1972 educators from several parts of the United States met at the
University of North Dakota to discuss some common concerns about the narrow
accountability ethos that had begun to dominate schools and to share what many
believed to be more sensible means of both documenting and assessing children's
learning. Subsequent meetings, much sharing of evaluation information, and
financial and moral support from the Rockefeller Brothers Fund have all
contributed to keeping together what is now called the North Dakota Study Group
on Evaluation. A major goal of the Study Group, beyond support for individual
participants and programs, is to provide materials for teachers, parents, school
administrators and governmental decision-makers (within State Education
Agencies and the U.S. Office of Education) that might encourage reexamination of
a range of evaluation issues and perspectives about schools and schooling.
Testing and Evaluation: New Views represents one effort in this direction.¹
What will be clear to readers of this bulletin is that the individual writers are
deeply involved in schools. This accounts for their intensity. They view
evaluation/testing issues as particularly important because they know these
issues seriously affect children, parents and teachers, oftentimes adversely. All
our writers can provide many examples of children who, as a result of
standardized testing programs, were labeled negatively as learners. From
firsthand experience our writers know of classrooms where normative testing and
other narrow evaluation measures limited the quality of human relationships,
where teachers felt forced to effect a segmented curriculum with children
engaged much of the day completing paper-pencil "skill" worksheets in order to
"improve" test scores, where the potential for children's growth as learners was
minimal because so much time was spent on what was most measurable, not on what
was most meaningful. It will also be clear that the writers believe strongly in
evaluation, viewing it as a critical process for growth, not only for individual
children and teachers but for educational programs as a whole.

The contributors share an open-classroom orientation (interactive rather than
behaviorist) toward education. They believe that learning is a personal matter
and has an intentional quality, that it varies for different children, that it
proceeds best when children are actively engaged, that it takes place in a
variety of environments in and out of the school and is enhanced in a supportive
setting where children are taken seriously. They favor an integrated curriculum
and an active decision-making role for teachers.

The bulletin is organized into three sections: (1) An Overview, (2) Testing
Problems and Possibilities, and (3) Some Examples of Meaningful Evaluation. In
Part One, Anne Bussis, Edward Chittenden and Marianne Amarel set forth what
they conceive as the inadequacies of experimental evaluation procedures for
programs concerned with "considerations of process, content and context."
Experimental procedures that stress standardization of treatment, behavioral
outcomes and quantitative data analysis tend to dominate educational
evaluation. By over-reliance on them we may, as James MacDonald has noted,
"reduce all significant school related behavior to performative acts. In the
process, we will say in effect that what takes place internally is either
illusory or irrelevant to our concern."²

There is an alternative with a tradition predating the experimental mode and,
from my perspective, more appropriate to many of the directions being established
in open, child-centered educational programs. Bussis, Chittenden and Amarel
describe that alternative.

In Part Two, the focus is on testing. Although we would not want readers to
believe that evaluation means testing, we realize that such a view is growing.
(How does one evaluate the progress of a first-grade classroom? Give the
Metropolitan Achievement Test, of course!) Susan Stodolsky provides a balanced
overview of "What Tests Do and Don't Do." While she does not argue that testing
has no place in schools, she does suggest that "as we move from teacher-centered
to child-centered classrooms, from group instruction to individualized
instruction, from a fixed to a more fluid curriculum, the whole enterprise of
testing must be reoriented and reassessed."

Michael Patton extends Stodolsky's discussion by providing a guide to the
statistical nomenclature accompanying testing. Does everyone using a
standardized test understand how "grade-level equivalency scores" are derived?
Or the meaning of "reliability" and "validity"? (This past summer, in a graduate
seminar relating to evaluation methodology, an experienced school principal,
after reading a statement that "grade-level is simply the middle score: half
must always be above and half below," asked, "Is that really true?") Patton
notes that the testing industry is booming in America. "Production of new tests
is occurring so rapidly that even specialists appear to be overwhelmed." But how
appropriate are the products? Are the tests free of serious error, bias and
invalidity? Do they provide better information than teachers can gain through
personal observation and interaction with children?

How would test-makers respond to the foregoing? I suspect they would agree
that there are some problems, that there is room for improvement. As George Hein
indicates, "The questions could be better, the standardization could be based
upon more representative samples of the population, the tests could be validated
against criteria more appropriate than the ones used. More imaginative use of
the available technology could vastly improve even paper-and-pencil
machine-graded examinations. A much broader range of activities could be
standardized." But Hein also argues that "Reform Is Not Enough." He asks us to
examine the politics of testing, the role testing plays in "sort[ing] and
classify[ing] children for their assigned roles in society." Thomas Cottle, who
is with the Children's Defense League, points out in Social Policy that
"tracking," one outcome of testing in many school districts, is "masked by
rational educational theory, its political implications overlooked by some who
think of it merely as an inevitable consequence of human differences."³

1 The annotated bibliography prepared by Brenda Engel (pp. 62-64) includes
material written by members of the Study Group. Early in 1975 a series of
additional evaluation monographs will be available. Selected titles are
Observation and Description: An Alternative Methodology for the Investigation
of Human Phenomena by Patricia Carini, Alternative Evaluation Research
Paradigms by Michael Patton, An Open Education Perspective on Evaluation by
George Hein, and A Handbook on Documentation by Brenda Engel. For further
information about the series, write to Vito Perrone, Center for Teaching &
Learning, University of North Dakota, Grand Forks, ND 58201.
2 James MacDonald, "An Evaluation of Evaluation," The Urban Review 7 (1974): 9.

Deborah Meier, in a question that relates closely to some of those raised by
Patton and Hein, asks, "What do we mean by reading competence?" To observe in
classrooms the mass of skill sheets aimed at auditory discrimination, blending
and syllabication, among others, one would have to conclude that reading is a
skill (Meier suggests "a trick") that, for the most part, is learned in
isolation. As Meier documents, little evidence is available to show that such
activities will improve a child's comprehension, the capacity for turning a
"written page into something that makes sense." Meier argues convincingly that
standardized reading tests contribute heavily to the concentration on the skill
sheets, tending in the process to distort the meaning of reading. Because she
feels that evaluation is important, Meier outlines some alternative means of
assessing reading.

Lois Barclay Murphy provides a synthesis to Part Two by bringing to readers'
attention some of the research relating to a range of important school issues,
for example, teacher judgment, children's intellectual growth and social
development, and motivation. The close of Murphy's sensitive statement provides
an excellent transition to the final section of this bulletin, "Some Examples
of Meaningful Evaluation," where qualitative data is a basic concern of the
writers.

Patricia Carini opens Part Three by discussing the documentation activity at the
Prospect School of Bennington, Vermont. "Taking Account of Process" is a
concept and a practice more of us ought to learn. It suggests a careful,
systematic documentation which assumes that learning cannot be "recorded and
assessed as isolated elements independently of the meaning for the learner." In
order to capture this broader view of learning, the processes underlying
children's directions "must be observed over time to determine a pattern, a
matrix of descriptions of the learner's involvement." Carini's examples, taken
from the Prospect School's documentation, are explicit and illuminating. In
introducing some of Carini's methodology to teachers, I have often had the
response, "You can't really expect us to do all of that!" I am convinced that
significant evaluation that "takes account" of children's growth and the
"standards that exist in a school" demands a level of intensity not yet
characteristic of most schools.

The way the Marcy Open School, Minneapolis, Minnesota, uses evaluation is
outlined next by Ruth Anne Aldrich. Participants at Marcy raised the questions,
"For what must a school be held accountable?" and "How can evaluation provide
us with the information we need to develop an increasingly responsible program?"
Attempting to respond to such questions has brought about a particular style of
evaluation and a conclusion that "schools should be growing, evolving
institutions aware of their successes and designing change for their failures."
How many schools have attempted to organize such a process of internal
evaluation? My experience suggests that it happens rarely. Perhaps this lack
accounts, in part, for the narrow evaluation patterns that exist.

What could we learn if we listened to children? Nancy Miller describes her work
with the Children's Interview, which focuses on children's roles in the
classroom and their contribution to their own learning, children's perceptions
of the teacher's role and their relationship to the teacher, the contribution of
classroom peer interaction to children's learning, and the children's view of
the classroom as an overall learning environment and the ways in which they
relate to that environment. Miller, and others of us who have worked with the
Children's Interview, have found it a powerful process for taking account of
children and their learning.

Another source for qualitative data on what is happening in a school or
classroom is the teacher. Anne Bussis and Edward Chittenden's summary of their
intensive Teacher Interview Study is, in large measure, an extension of their
methodology article which is included in Part One. "In what ways do teachers
think about teaching? How do they conceive of the complex pattern of events that
mark the school day? What assumptions do they hold about learning and
development? What are the grounds for their planning, provisioning and
evaluation?" Taking account of the teacher's "personal construct" is a process
that is rarely a part of what schools view as evaluation. But not to take
account of such a process is to fail to understand the central role of the
teacher in most classroom settings.

3 "What Tracking Did to Ollie Taylor," Social Policy (July-August 1974): 21.
Readers might also note the NAACP and the Mexican American Defense League are
engaged in several litigations in California that relate to the social effects of

We close with a highly selective annotated bibliography prepared by Brenda
Engel. Those of us who have been struggling with the issues that dominate this
bulletin are aware that much of the important literature is in unpublished form
and hence not readily available. Engel has included only accessible materials
with the potential for extending our discussions.

At the outset of this introduction, I commented that one of our major purposes is
to encourage a rethinking of a range of evaluation issues. We hope that this
bulletin is helpful to such a process. We invite your reaction.

Vito Perrone

December 1974

Alternative Ways in Educational Evaluation

Anne M. Bussis, Edward A. Chittenden,
and Marianne Amarel



Elementary educators are caught up in debate over assumptions about ways that
children learn best, about the teacher's role in curricular and instructional
decision-making, and about the networks of human relationships that constitute a
school. Although the development of approaches for studying and evaluating
educational programs is an old problem, these current debates have served to
highlight the inadequacies of existing ways of categorizing and thinking about
educational variation. Our purpose here is to describe problems in educational
evaluation as we have come to identify them in the course of our studies with
teachers, children and schools, and to suggest some alternative directions.



The Problem of Educational Models 

A look at the total field of elementary education suggests that the present degree
of experimentation and variation among programs and approaches is greater than in
the past. Much of this experimentation has involved only surface change of a
somewhat gimmicky nature, but change in some schools appears much more
substantial, involving the development of new understandings regarding learning
and teaching. The literature about change in early education, however, has tended
to blur rather than clarify the issues underlying these variations. Part of the
difficulty is that the literature has cast variation predominately into the
language of "models." The model concept would seem an efficient way of
describing the basic components of different programs, but in reality it has
proved to be something of a trap for educator and researcher alike.

Educational "models" have two clear qualities. First, they contain accounts of,
or prescriptions for, methods: ways of "doing" the model. Second, they contain
statements about intended outcomes: objectives to be accomplished. Frequently
both methods and outcomes are stated in concrete behavioral terms. Models differ
extensively in the kinds of materials and methods they prescribe, in the settings
for which they are intended, in their degree of prescriptiveness, and in the
scope of their ambition; i.e., some models are for instruction in selected areas
only, whereas others are for instruction more broadly conceived. Some early
education models are designed along behavior modification lines. Some are based
on child development theory (e.g., a Piaget curriculum); some are models of
components of the British "integrated day."

Inasmuch as model thinking is very much a part of most research and evaluation
schemes, at the very outset the teachers participating in such evaluation have to
contend with a set of rules that may not have been clearly explicated. For
example, they soon may be asked to specify exact teaching methods and desired
student outcomes, or otherwise find themselves responding to a request to
delineate the criteria of a "model classroom." To the extent that the teachers go
along with, or cannot escape, the pressure to operate in this manner, they do
indeed try to implement the model, whatever they perceive it to be. They may try
to dispense reinforcements in order to "modify behavior," run twenty-five
tutorials in order to "individualize instruction," or set up interest corners in
order to have an "open classroom." The defining criteria of methods and outcomes
they have previously specified then become the yardstick of success, the
yardstick by which they and the evaluator measure their efforts.

Standards of Quality

In contrast to this specification of "method" and "outcome" that is associated
with a "model" approach in evaluation, many educators today are emphasizing the
development of standards of quality in learning: considerations of process,
content and context. Such standards represent educational and psychological
constructs more than behavioral criteria. They are a frame of reference from
which the teacher works and evaluates what has been accomplished, but they are
not prescriptions of methods "to be followed" or outcomes "to be obtained."
Standards, in the sense of constructs or frames of reference, are neither
directly "followable" nor "obtainable," since a construct (by definition) is of
a more abstract nature. It is a principle, an understanding derived through
intellectual synthesis, that underlies the teacher's consideration of any
particular procedure or learning. Educators working within such a framework have
as an objective the creation of conditions that promote quality learning, but
they do not and cannot have strictly prescribed methods for achieving that
objective, and they do not and will not have a limited, narrow set of behaviors
in mind as the only guide for judging children's progress in learning. Another
difference worth noting is that standards cannot be stated exhaustively
beforehand. By their very nature, standards become better articulated over time
and with experience, but they necessarily remain "open constructs," capable of
absorbing new and unpredicted examples within their definition.

Toward Clarifying the Practitioner's Frame of Reference

Research on educational models (e.g., Head Start, Follow Through) suggests that
teacher differences within educational programs are as great as or greater than
variations between programs. Although the problem has been recognized, the
apparent solution that is attempted in conducting evaluation is to try to obtain
greater clarity of program descriptions and to define criteria in the hope that
teacher variability (within the given model) can be greatly reduced.

The problem as we see it, however, is not so much one of trying to define
precisely any particular educational program, but of defining those
characteristics that reflect program variations at the level of teacher
understandings and perceptions. The following quotation from a study of preschool
programs illustrates what we mean (DiLorenzo et al., 1969):

It was the naive assumption of the research staff responsible for the design of
the study that prekindergarten programs for the disadvantaged existed in packages
to be picked off the shelves in the educational marketplace. Once the districts
had made their choices, the program treatments would be inserted into the design.

Distinct programs did not exist. Points of view . . . did, and they determined
the type of program which evolved.

Analysis of the data from our teacher interview study thus far indicates that
programs and classrooms that superficially seem to share common elements may
be very different in intent and emphasis. To distinguish between them, one needs
to identify the basic or "prototypic" notions that predominate in the teacher's
referents for instructional activity. What are his or her learning priorities,
beliefs about what children should learn to "care" about? How does the teacher
view children as organisms: are they seen as constructors of reality or
receptacles for knowledge, or some combination of both? What are his or her
assumptions about the organization of knowledge and how it is best learned? What
are his or her perceptions about the use and potential value of material
resources in the classroom? The answers to these and similar questions determine
the nature of the construct systems that guide teaching behavior. Tentatively at
least, we have identified what seem to be quite different construct systems, all
from a population of teachers working under the heading of "open education."¹

Although the classrooms of two different teachers might look rather alike at a
given point in time, the notion of personal construct systems implies that they
may be headed in quite different directions and that the teachers may expect
different kinds of environmental support. Such a conceptualization, one dealing
with the practitioner's frame of reference, has important implications for
evaluation and research, as well as practical value in helping to sort out the
activities now going on under numerous labels.



1 See pp. 58-61 of this publication.



Toward Clarifying the Research/Evaluation Frame of Reference

Substituting the practitioner's frame of reference in place of precise
specifications for educational programs is one step in a different direction,
but it is not sufficient in and of itself. We also need to look at the
research/evaluation model.

One basic problem with the conventional research/evaluation approach is that
"student outcomes" are generally defined by behavioral criteria and are therefore
statements about learning necessarily lifted out of the context of the total
activity as it actually occurs in the classroom. Thus, as an example, the
outcome-statement "works independently on a project" does not take into account
the purpose or quality of the particular project in question, whether the project
is more appropriately a group or an individual endeavor, and so on. This
difficulty is the same one mentioned previously in our discussion of "model
thinking," where we contended that some educators are now emphasizing standards
of quality in learning. Thus, they evaluate teaching in terms of educational and
psychological constructs and not in terms of out-of-context behavioral criteria.

To illustrate the idea of standards a bit more, standards of quality in the
process of learning would include such factors as originality of work (the notion
of "authorship"), purposeful effort, independence of effort. A concern with
quality in the content of learning would include consideration of what children
produce (e.g., writing/drawing), evidence that instruction deals with
"powerful"² concepts (e.g., graphing) as well as necessary skills
(addition/subtraction). Standards relating to the context of learning center on
desired qualities in the nature of human relationships (child/child as well as
child/adult): openness and honesty of encounters, respect for the efforts and
feelings of others. However standards are thought about and described, the
important point is that different kinds be considered and applied in evaluating a
particular learning situation.

These standards that provide an evaluative framework for many teachers could,
with clarification, serve to broaden the general approach to formal evaluation,
because they suggest relevant evaluation data other than student behavior taken
out of context.

Another problem with the conventional research/evaluation approach is the
implicit assumption that the educational "treatment" causes or produces the
objective or outcome. Rather than stating objectives for children that may be
attained as a result of certain methods or treatments, however, many educators
prefer to state assumptions about children's capabilities that may be realized in
certain facilitating environments. They assume, for example, that all children
are capable of displaying intelligent effort, responsibility, concern for others,
respect for self, given an instructional environment that elicits and supports
such behavior. Such capacities are not thought of as instilled or caused by a
preconceived educational treatment but rather as drawn out and encouraged by a
responsive and flexible educational program.

2 See [author name illegible] (1970) discussion of "power" as a dimension of
concepts.

The difference between stating objectives for children and assumptions about
children's capabilities and resources may seem minor at first, but it has
far-reaching implications. It is a difference that leads to (a) a concern with
environments rather than treatments, (b) an emphasis on response variability
among teachers rather than response uniformity, and (c) a focus on standards of
quality in learning rather than behavioral criteria outside the context of
purposeful action. If research is to accommodate these priorities now being held
by many educators, an overhauling of our basic paradigm seems called for. As a
step in this direction, we suggest the following:



Assumptions about children's resources (capability statements)

    Focus on facilitating environments
    Emphasis on "opening up" response repertoires and increasing teacher variability

    Evaluation evidence in terms of standards of quality applicable to a wide
    variety of student/teacher behavior, as well as to aspects of the physical
    environment

as an alternative to

Objectives for children to attain (behavioral statements)

    Focus on educational treatments or methods
    Emphasis on standardizing response repertoires and decreasing response variability

    Evaluation evidence in terms of specific behavioral criteria, i.e., similar
    behavioral expressions by all children and teachers



Research and evaluation along the lines suggested in the diagram can draw upon
a tradition within psychology that has emphasized the study of inner states such as
belief systems, attitudes and understanding. In the United States this tradition is
represented in the writings of Kelly (1955), Snygg and Combs (1949), as well as
others. With a few notable exceptions, this "phenomenological" tradition has not
been recognized within educational research. Instead the field has been
dominated by the conventions of testing and measurement and by behavioral
psychology. At the level of instrumentation, new approaches that fit with the
phenomenological tradition could well include interviews, documentation of
environments through observation, the systematic collection of work and language
samples.

It is one thing to analyze issues and point to directions that might alleviate some
problems. It is obviously quite another matter to translate ideas into reality.
Nonetheless we feel fairly confident that advances can be made. Many people
have already made progress in devising more appropriate ways of assessing
children, teachers and educational environments. The problems and questions
raised here are complex. But if they are not addressed, we face the real possibility
that a good deal of substantial progress in educational thinking and practice will go
down the drain, because it is judged to be "not very effective" on the basis of
inappropriate criteria.

"Objectivity" and Decision-Making

A final problem in much of the current thinking about educational evaluation is
that it is assumed to be an "objective technology." That behavioral science
operates in an objective (in the sense of value-free) fashion or that evaluation leads
to objective (value-free) decision-making are myths that have been too long with
us and far too widely perpetuated. The latter myth is particularly destructive to the
degree that people in education actually believe it, which many apparently do.
Decision-making is invariably a subjective, human activity involving value
judgments (or weights) placed on whatever evidence is available to the
decision-maker. Depending on the extent to which parties to a decision agree that the
available evidence has been impartially gathered and represents "important"
information, people may or may not agree on the meaning of the evidence. Even
when there is virtual consensus on "the facts of the matter," such facts do not
automatically lead to decisions regarding future action. People make decisions;
information does not.

Biologist René Dubos, in describing the diverse reaction of fellow scientists to
his book Only One Earth: The Care and Maintenance of a Small Planet, provides an
instructive example of the human reality of decision-making (1972, p. 508):

Starting from the same set of scientific facts, the experts arrived at a multiplicity
of conflicting conclusions with regard to the practical policies concerning the
environment: policies, for example, about nuclear energy, pesticides, further
industrialization of the world, et cetera. Their conflicts originated not from
differences in knowledge or interpretation of facts, but from differences in the
value judgments they put on these facts. In this regard, experts display as much
diversity as nations and individual persons; they differ not only in their approach
to social and human goals, but even more in the selection of these goals.

Although it is understandable that the term "evaluation" might gradually come
to be applied to the activity of gathering information prior to decision-making, it is
not at all clear why the human activity of actually "evaluating" the information has
been so left out of the publicized picture. If the values that dictate educational
decisions remain unexplicated (if, by default, they are the implicit values built
into the information-gathering instruments), then we are indeed settling for more
or less impersonal decisions, but they are hardly "objective decisions." Perhaps
we should use terminology such as assessment and analysis for
information-gathering activity, and reserve the term evaluation specifically for decisions
made about the information.

Concluding Remarks

In summary, we have proposed that the assumptions inherent in the notions of
educational treatment and behavioral outcomes are basic issues that need to be
readdressed along with problems of instrumentation and data interpretation. While
alternative models of educational evaluation do not as yet exist on any broadly
accepted basis, the range of admissible techniques and strategies has broadened,
and in some places, parents, school board members, administrators, and state and
federal officials have supported alternative forms. Hence the need for the kinds of
accounts described later in this publication: examples that can be looked to for
guidance.

References

Di Lorenzo, L. T., R. Salter, & J. J. Brady. Prekindergarten Programs for Educationally Disadvantaged
Children. Washington, DC: U.S. Office of Education, Department of Health, Education and Welfare,
1969.

Dubos, R. "The Despairing Optimist." The American Scholar 41, 4 (Autumn 1972): 508-12. Copyright ©
1972 by the United Chapters of Phi Beta Kappa. By permission of the publishers.

Flavell, J. H. "Concept Development." In Carmichael's Manual of Child Psychology (3d ed.), Vol. I,
Paul H. Mussen, ed. New York: John Wiley, 1970.

Kelly, G. The Psychology of Personal Constructs, Vol. 1. New York: W. W. Norton, 1955.

Snygg, D., & A. W. Combs. Individual Behavior (Rev. Ed.). New York: Harper & Brothers, 1959.



Part II

Testing: Problems and Possibilities

What Tests Do and Don't Do

Susan Silverman Stodolsky

Tests come in many shapes and sizes. Test constructors have produced
instruments for use in measuring a wide array of human characteristics (Buros,
1971; Johnson and Bommarito, 1971). Most tests children take while in school are
teacher-made, that is, designed by their own teachers. Others are provided by
textbook publishers. In addition, a child in elementary school may be administered
a group intelligence test, possibly some aptitude or interest measures, and a
number of standardized achievement tests.¹ As George Weber notes in his
pamphlet, Uses and Abuses of Standardized Testing in the Schools (1974):

Some standardized tests do not do a good job of what they claim to do, and for
some testing purposes non-standardized tests are more appropriate or more
efficient. But standardized tests are used by all our public schools. Important and
even critical conclusions and decisions are made on the basis of their results. For
example, on the basis of their results the public is told that reading achievement is
going up or going down, an experimental program is deemed successful or
unsuccessful, a child is placed in this or that class, and students gain admission to a
particular college or fail to do so (pp. 1-2).

Standardized test scores have been shown to play a crucial and often
unwarranted role in determining further schooling and teacher attitudes and in
affecting pupil self-concepts. For most of us, testing has become an expected, if
not accepted, part of schooling.

Teachers, administrators and parents ought to determine the appropriate roles
for testing and judge the utility of testing in fostering the healthy growth and
development of children. In this paper I will try to present an overview of the field
of testing, including ways the field itself is trying to change. I will discuss testing in
terms of (1) purposes for giving tests, (2) effectiveness of different types of tests in
providing needed information for a given purpose, (3) determination of who is
benefited by the test results or testing experiences, (4) relationship between the
instructional process and testing procedures, and (5) kinds of skills, learning and
growth to be assessed. These issues seem to me to be central if a teacher wants to
[...]

I hope to answer questions like the following: What are the various
purposes for which tests can be constructed? What kinds of tests are
or could be available? What kinds of information can tests provide? What are the
major values and limits of testing?



Author is grateful to [...] for assistance in searching the literature and for helpful suggestions.



For purposes of this paper, when I refer to a test or testing, I mean a systematic
and deliberate way of sampling a student's behavior or thinking. Ordinarily we
think of a test as a paper-and-pencil device, but many other kinds of
evidence-gathering procedures are available to the teacher and researcher.

To simplify discussion of the purposes of testing and types of tests, I will restrict
my attention to evaluation of academic achievement.

Recently a number of important distinctions have been made regarding possible
functions of testing and ways tests are constructed, scored and interpreted.
Historically tests have been administered mainly at the beginning of some learning
experience (purpose: prediction or selection) or at the end of a learning experience
(purpose: grading or classification).

There are situations in which using tests for prediction, selection and
classification is justified. It is important to recognize, however, that such usages
assume that the success or failure of a child in school is a function of the child's
initial characteristics, and that the educational environment or program is virtually
fixed. When standardized tests have been successful in predicting further
achievement of students, instruction in the schools has usually persisted in sorting and
ranking students in much the way they would be ranked on standardized tests.
Even so, standardized tests are usually best at predicting future performance on
similar instruments, not in predicting success in the life activities that might be
associated with the area of achievement measured.

Ordinarily the purposes of prediction, selection and classification are best
served by norm-referenced tests, which rank order individuals or groups. Michael
Patton's article which follows describes the meaning (as well as some of the
problems) of norm-referenced tests. Here, it is important to keep in mind that
standardized achievement tests of this type are constructed by attempting to
sample some domain of subject-matter content and learning processes that
represent the objectives of a curriculum or group of curricula in use in the schools.
Norm-referenced tests can reliably estimate the relative standing of a child with
respect to the area measured. By the very nature of the way they are constructed,
however, these tests cannot provide the child, his teachers or parents information
about what he has specifically learned or not learned in a given subject matter over
a given period of time. Since standardized tests are only samplings of course
material, scores cannot be used to determine explicit instructional needs of
children in any but the most global way.²

¹ For useful discussions of intelligence tests see Anastasi, 1961, and Kaye, 1973.

² Standardized achievement tests are often used in comparing one educational program with another. Norm-referenced tests
have been used as key measures in most large-scale evaluations of intervention programs such as Head Start, Follow Through
and [...]. The use of standardized tests for comparative evaluation studies may be appropriate if the programs studied are
attempting to achieve the objectives measured in the test, but this is not often the case (Stodolsky, 1972). Usually when
norm-referenced tests are used in comparative evaluations, they are better articulated with some programs than with others.
Also, children in certain educational programs have more familiarity with test-like situations and are more able and willing
to produce on demand (Shapiro, 1973; Chittenden and Bussis, 1972). At best, results from the administration of standardized
achievement tests give us a heavily confounded estimate of the actual academic achievement of children in different [...]

Formative and Summative Testing

Closely related to the purposes served by standardized achievement tests is the
idea of summative testing, which is concerned with product measurement. A
summative test is constructed in a manner similar to a standardized achievement
test in that the questions are sampled from the course objectives and contents.
Since the purpose of a summative test is usually grading, these instruments are not
designed to provide detailed feedback to the student. Summative tests ask, "How
well has the student mastered the material and processes involved in the learning
units he has just studied?" In many educational contexts it is still believed desirable
to inform a student about his progress relative to the expectations of a course. Final
exams constructed by teachers are the most common variety of summative tests.

The term formative testing has two uses in the literature, both concerned with
providing feedback about the learning process. In one usage (Scriven, 1967), it is
testing to gather evidence while a curriculum is being developed, so that
curriculum writers can improve materials and procedures. In the other usage
(Bloom, Hastings and Madaus, 1971), formative testing is used as students go
through learning units in order to provide feedback to students and teachers about
their progress and to suggest areas in which additional learning and practice are
necessary. Formative tests are part of a trend to use tests to provide meaningful
feedback about student learning and growth.

As defined by Bloom, formative tests are often used in conjunction with a
mastery-learning strategy in which it is expected that virtually all students can
master the unit being studied with sufficient time and learning aids (Block, 1971).
In this context, students take formative tests when they have completed initial
study of a learning unit. Since formative testing is an integral part of the
instructional process, these tests are very different from norm-referenced tests.
Formative tests cover a relatively narrow range of topics and need to provide
sufficient information so that a student can determine future steps in learning
based on the test results. For example, a formative test dealing with a unit on
learning long division would contain a number of items that would pinpoint the
steps students had and had not mastered, whereas a summative test would not
include such items but only long division problems.
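
The long-division contrast can be made concrete with a small sketch. The Python
below is only an illustration under stated assumptions: the item names, the step
labels and the 80 percent cutoff are hypothetical and are not drawn from any
published test. It shows how one set of answers yields a single total for
summative purposes and a step-by-step profile for formative feedback.

```python
from collections import defaultdict

# Hypothetical items for a long-division unit. Each item is tagged with the
# step it is meant to probe; the tags are assumptions made for illustration.
ITEMS = [
    ("q1", "estimate quotient digit"),
    ("q2", "estimate quotient digit"),
    ("q3", "multiply and subtract"),
    ("q4", "multiply and subtract"),
    ("q5", "bring down next digit"),
    ("q6", "bring down next digit"),
]


def summative_score(answers):
    """Return one overall number: the count of items answered correctly."""
    return sum(1 for item, _ in ITEMS if answers.get(item, False))


def formative_profile(answers, cutoff=0.8):
    """Return, for each step, whether the proportion correct reaches the cutoff."""
    right = defaultdict(int)
    total = defaultdict(int)
    for item, step in ITEMS:
        total[step] += 1
        if answers.get(item, False):
            right[step] += 1
    return {step: right[step] / total[step] >= cutoff for step in total}


# One hypothetical student's answers (True means correct).
answers = {"q1": True, "q2": True, "q3": True, "q4": False, "q5": True, "q6": True}

print(summative_score(answers))    # a single total, suitable for grading
print(formative_profile(answers))  # which steps still need practice
```

In this made-up case the total hides the fact that only one step, multiplying
and subtracting, falls short of the cutoff; the profile is what tells the student
where to work next.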

The use of formative testing for student feedback does not depend on adopting a
mastery-learning strategy. It does require a decision on the part of the teacher to
use tests as instructional tools or aids. Similarly, when formative tests are used in
the curriculum development process, their chief purpose is to provide detailed
feedback about the curriculum and its effectiveness so that weaknesses can be
improved.

Criterion-Referenced Testing

Another recent distinction is that between criterion-referenced and
norm-referenced tests. We have already discussed norm-referenced tests, whose major
purpose is to allow one to interpret scores with respect to the relative standing of
an individual or group. Criterion-referenced tests have been developed in order to
provide information about student performance that has been difficult to obtain
from standardized tests. "A criterion-referenced test is one that is deliberately
constructed to yield measurements that are directly interpretable in terms of
specified performance standards" (Glaser and Nitko, 1971, p. 653).

Scores from criterion-referenced tests should be directly interpretable in terms of
actual student behaviors and abilities. Either the items or tasks are precisely those
one is interested in assessing, or the items on the test have been shown to directly
relate to the behavior of interest. For example, on a criterion-referenced reading
test one would be able to relate performance on a given set of items to the child's
ability to read and comprehend passages of established difficulty or specific books.
Such an interpretation would differ from that in a standardized test situation
where the specific skills and abilities of a child scoring at a given grade level could
not be well defined. Criterion-referenced tests can be used for certifying
performance (e.g., a lifesaving test), for direct instructional feedback or for summative
purposes.
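
The reading example can be sketched in a few lines of Python. Everything
specific here is an assumption made for illustration: the four difficulty levels,
the item counts and the 75 percent performance standard are hypothetical. The
point is only that the result is stated in terms of what the child can do, not in
terms of where the child stands relative to other children.

```python
# Hypothetical results for one child: for each passage difficulty level,
# (comprehension items answered correctly, items asked).
RESULTS = {1: (9, 10), 2: (8, 10), 3: (5, 10), 4: (2, 10)}


def levels_mastered(results, standard=0.75):
    """Return the difficulty levels on which the child meets the standard."""
    return [
        level
        for level, (right, asked) in sorted(results.items())
        if right / asked >= standard
    ]


# Passages at these levels are read with adequate comprehension.
print(levels_mastered(RESULTS))  # [1, 2]
```

A teacher reading this output learns which passages the child can handle now,
which is the kind of direct interpretation a grade-level score does not supply.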



Diagnostic Testing

A last type of test that has relevance to teachers is diagnostic testing. Typically,
diagnostic testing attempts to provide an assessment of a student's present
strengths and weaknesses with regard to a given area of achievement. Diagnostic
tests may be very similar in construction and intent to formative and/or
criterion-referenced tests. However, an additional factor involved in diagnostic
testing may be the desire to find the cause, or etiology, of difficulties in learning. For
example, a child having difficulty learning to read might be assessed with regard to
vision, hearing and other perceptual functions. Possible motivational or emotional
factors associated with reading or learning might also be explored. In diagnostic
testing it may be necessary to go beyond course content in order to determine the
best ways to begin remedying some difficulty in learning. Sometimes direct
instruction on prerequisite learning is quite sufficient, but at other times additional
efforts are necessary.

In my opinion, the development of criterion-referenced, formative and
diagnostic tests holds promise for teachers and students. I would expect fewer
dangers and limitations with such tests than one would expect with standardized
achievement tests. These newer forms of tests are meant primarily as aids in the
instructional process and can be related much more closely to the actual
classroom learning of the child than are standardized tests. There are difficulties in
constructing such tests, however, and we need more experience before we can feel
confident that their promise as instructional aids will be fulfilled.

Tests and the Instructional Setting 

In considering the purposes for which tests might be used, a central issue is the
relation between testing and the instructional program. Most currently available
tests are geared toward instructional settings in which the curriculum is specified
and in which each child is expected to master the same materials and objectives as
his classmates. As we move from teacher-centered to child-centered classrooms,
from group instruction to individualized instruction, from a fixed to a more fluid
curriculum, the whole enterprise of testing must be reoriented and reassessed.

It is possible that the entire role of testing as we usually conceive it has little
place in more informal educational environments. I believe that persons involved
in changing educational environments must be able to specify the areas of human
learning, growth and development they believe to be important. All educators
have an obligation to systematically document growth and change in children.
Informal educators should be pressuring those in the field of testing and in
research positions to develop ways of assessing the important aspects of growth
and development that informal education tries to foster. I do not pretend that the
task of assessing the goals of open education will be easily solved; the behaviors
and outcomes are very difficult to measure and may not be open to classic
psychometric approaches. Nevertheless, I believe test publishers and evaluators can be
responsive to educational change and might well have the resources to begin to
solve some of the difficult measurement problems involved.

Undoubtedly, to specify and assess pupil growth in informal settings will be
more difficult than in traditional schools. One major problem is how to deal with
the fact that informal education fosters a diverse set of outcomes and activities
(Karlson and Stodolsky, 1973). Nevertheless, I think the basic notions of formative
and criterion-referenced testing could be applied to children in certain areas of an
informal curriculum, such as in the development of math and science concepts,
the development of reading skills, and with respect to certain general areas of
problem-solving and critical thinking. Systematic use of work samples and
observational schedules as well as interviews with children are also promising
methods for the informal classroom. A recent article by Hawes (1974) offers a
collection of many useful evaluation alternatives.

More Broadly Conceived Tests Are Needed

The last area I would like to discuss in regard to testing is that of content or
domains. Most of this paper has implicitly assumed that the major area of
assessment in regard to children is academic, subject-matter achievement. As
teachers we have often been committed to more than cognitive outcomes for our
children, but rarely do we systematically collect evidence about growth outside
the cognitive area. As we begin to broaden our entire view of schooling and
learning, we should simultaneously attempt to incorporate into our evaluations
aspects of growth that are affective, social, emotional, fanciful, creative in nature.
I agree with others (e.g., Chittenden) that one should attempt to evaluate these
aspects of functioning in the context of subject matters or areas of activity. Thus,
rather than taking a trait-like approach, one would work within the concept of the
child as a whole person and his activity as incorporating many facets of
functioning. Despite the difficulties, we must attempt to assess children's
development of interests, their capacity and modes of learning on their own,
patterns of social interaction and the like.

It seems to me that a continuing dialogue between teachers, researchers and
persons involved in test construction could help broaden the options available to
us all.

From the viewpoint of informal education in particular, we need to alter the
traditional situation in which standardized tests are often powerful shapers of
curricular content. To break out of such a pattern, teachers must be able to
articulate their views about growth, to use their observations as illustrations of
growth, and to choose consistently when their educational purpose is compatible
with the purpose of a particular assessment procedure.

References

Anastasi, A. "Psychological Tests: Uses and Abuses." Teachers College Record 62 (1961): 389-93.

Block, J. H., ed. Mastery Learning: Theory and Practice. New York: Holt, Rinehart & Winston, 1971.

Bloom, B. S., J. T. Hastings, & G. F. Madaus. Handbook on Formative and Summative Evaluation of
Student Learning. New York: McGraw-Hill, 1971.

Buros, O. K. The Seventh Mental Measurements Yearbook. Highland Park, NJ: Gryphon Press, 1971.

Chittenden, E. A., & A. M. Bussis. "Open Education: Research and Assessment Strategies." In Open
Education: A Sourcebook for Parents and Teachers, E. B. Nyquist & G. R. Hawes, eds. New York:
Bantam Books, 1972. Pp. 360-74.

Glaser, R., & A. J. Nitko. "Measurement in Learning and Instruction." In Educational Measurement,
2d ed., R. L. Thorndike, ed. Washington, DC: American Council on Education, 1971. Pp. 625-70.

Hawes, G. R. "Testing, Evaluation and Accountability: Managing Open Education." Nation's Schools
93, 6 (1974): 33-47.

Johnson, O. G., & J. W. Bommarito. Tests and Measurements in Child Development: A Handbook. San
Francisco: Jossey-Bass, 1971.

Karlson, A. L., & S. S. Stodolsky. "Predicting School Outcomes from Observations of Child Behavior in
Classrooms." Paper presented at annual AERA meeting, New Orleans, Feb. 1973.

Kaye, K. "IQ: A Conceptual Deterrent to Revolution in Education." Elementary School Journal 74
(1973): 9-23.

Scriven, M. "The Methodology of Evaluation." AERA Monograph Series on Curriculum Evaluation 1
(1967): 523-49.

Shapiro, E. "Educational Evaluation: Rethinking the Criteria of Competence." School Review 81 (1973):
523-49.

Stodolsky, S. S. "Defining Treatment and Outcome in Early Childhood Education." In Rethinking Urban
Education, H. J. Walberg & A. T. Kopan, eds. San Francisco: Jossey-Bass, 1972. Pp. 77-94.

Weber, G. Uses and Abuses of Standardized Testing in the Schools. Washington, DC: Council for Basic
Education, Occasional Paper #22, 1974. Cited by permission.



Understanding the Gobble-dy-gook

Michael Quinn Patton¹

Statistical thinking will one day be as necessary for efficient citizenship as the
ability to read and write.
H. G. Wells

Standardized tests now pervade our lives in ways never dreamed possible by
early pioneers in educational measurement only a century ago. Production of new
tests is occurring so rapidly that even specialists appear to be overwhelmed. The
dimensions of this explosion are indicated by the fact that in 1972 the Educational
Testing Service Collection held 680 different tests in just one category alone:
reading. The Seventh Mental Measurements Yearbook (Buros, 1972) is composed
of two fat volumes describing the vast literature on tests and measurement. At the
same time testing methodology and theory are becoming increasingly complex.

At the 1972 Invitational Conference on Testing Problems, Henry S. Dyer noted
these facts and put forth "Dyer's First Law of Information Dilution," which states
that, as knowledge expands while the population of potential users of knowledge
also expands, the probability approaches unity that everybody is ignorant of what
anyone else knows. In other words, the great majority of test users simply does not
have the time to look up or catch up or keep up with the enormous number of tests
and the mountainous literature that the testmakers continue to pile up (Dyer,
1973:91).

Standardized tests have been developed for almost any cognitive, affective or
social human trait you can think of, from intelligence to alienation, self-concept to
maturity, moral development to creativity. These tests are being used to select
people into and out of a wide range of educational programs, private and public
projects and a variety of jobs, often without knowledge or understanding of how
the tests are being used on the part of those being tested. Standardized testing has
become a socio-political tool, deciding the fate of both individuals and entire
educational programs (cf. Manning, 1969; Kirst and Mosher, 1969; Cohen, 1970;
McDill, McDill and Sprehe, 1972). Again to quote Dyer (1973:86):

The field of education has become strewn with politics, and educational testing
has become an instrument, if not a weapon, in the political process. And this
means that our worries today about the mishandling of tests and the misuse of test
scores must embrace not only school personnel, but also politicians and the diverse
and pluralistic constituencies they serve.

¹ I am indebted to Dean Vito Perrone, Center for Teaching and Learning, University of North Dakota, and participants in the
North Dakota Study Group on Open Education meeting in Chicago, November 9-10, 1973, for encouragement and helpful
suggestions in writing this article. Support from the National Institute of Mental Health in the form of a postdoctoral
fellowship in Evaluation Methodology was also helpful. Comments by Barbara French, University of Minnesota, on an earlier draft
of this paper were particularly useful.






This also means that, as H. G. Wells predicted, individual teachers, parents,
students and administrators need to make basic statistical thinking a part of their
personal survival kit. As Darrell Huff (1954) noted in his primer, How To Lie with
Statistics, the abusers already know the inadequacies and tricks of statistics;
"honest men must learn them in self-defense" (p. 9).

The Meaning of Numbers: Interpretation

The first thing to keep in mind when interpreting standardized test scores is that,
even at their best, they are only rough indicators of some human characteristic.
Anne Anastasi (1973:xi) has noted the danger of focusing so much on tests and test
scores that we lose sight of the actual behaviors that matter to us:

The widespread misconceptions about the so-called IQ provide a particularly
flagrant example of such a dissociation. One still hears the term "IQ" used as
though it referred, not to a test score, but to a property of the organism.

In other words, the numbers that come out of standardized tests are not
embedded in the genes or on the foreheads of students. They are only rough
approximations of some characteristic at a specific point in time under particular
conditions. Test results are only one piece of information about a person or a
group, a piece of information that must be interpreted in connection with other
information we have about that person or group.

Test scores then are neither good nor bad. They are pieces of information that
are subject to considerable error, and that are more or less useful depending on
how they are gathered, interpreted, applied, abused and used. In this context let us
look at some of the more frequent types of scores reported.

Norms

Let a parent read, as many have done in such places as Sunday rotogravure
sections, that "a child" learns to sit erect at the age of so many months and he
thinks at once of his own child. Let his child fail to sit by the specified age and the
parent must conclude that his offspring is "retarded" or "subnormal" or something
equally invidious. Since half the children are bound to fail to sit by the time
mentioned, a good many parents are made unhappy. Of course, speaking
mathematically, this unhappiness is balanced by the joy of the other fifty percent of
parents in discovering that their children are "advanced." But harm can come of
the efforts of the unhappy parents to force their children to conform to the norms
and thus be backward no longer. Hardly anyone is normal in any way. . . .
Confusing "normal" with "desirable" makes it all the worse.
Darrell Huff (1954:44-5)

"Norms" are scores that provide a comparison for interpreting how one pupil,
school or school system compares to some other group. Norms provide
information about how children from some comparison group actually performed
on a particular test, not how they ought to have performed. Norms can be reported
in several ways: percentile ranks, grade equivalents, stanines, scaled scores,
difficulty coefficients and quartile points, among others. The important point to
keep in mind is that all of these ways of reporting results are based on the same
basic information. Some are simpler to understand than others, some are more
useful for special purposes than others, but all of them are ways of looking at how
an individual or group performed compared to how the norm group performed.
Such "norm-referenced" tests constitute the most common type of standardized
test and are the only type we shall discuss here.

eric • . 



19. 



Norms are used for interpreting test results because raw scores (i.e., the total number
of correct answers on a test) have very little meaning in themselves. Different tests
have different numbers of items and are usually designed so that high raw scores
are unlikely. For example, in the ninth-grade mathematics test, Minnesota High
School Achievement Examinations, 3[digit illegible] correct answers out of 70 items places a
student in the 99th percentile. Or consider the Stanford Achievement Test in
reading (1964 edition), where a beginning third-grade child need answer correctly
only 33 of 60 items to be exactly at grade level in paragraph meaning. For the test
results to have meaning, raw scores [words illegible in source]
usually involving comparison to some norm group. All major standardized tests
include norms for a representative national sample of pupils. For many tests, norms
are also available for the state, the county, the city, and/or even a specific local
school.

Standardized test scores are somewhat easier to interpret when you know a bit
about how they are designed. Norm-referenced tests are designed so that a
national, representative sample of students taking the test will fall along what is
called a "normal" curve. On such a curve, most students will bunch near the
middle scores on the scale (or near the average score), with those performing at
higher or lower levels spaced more toward the extremes of the scale above or
below the average. Diagram I shows a normal curve for 100 runners in a 10-second
foot race.²

[Diagram I. Illustration of a normal curve: 10-second foot race, 100 runners. Horizontal axis: number of feet run in ten seconds (145 to 295, with 220 marked as the average). Second scale: number of people behind a runner at this point (0 to 99). Bands: below average 23%, average 54%, above average 23%.]

Two points are particularly important with regard to this diagram as it applies to
standardized tests. First, and most important, the tests are designed so that about
half of the children score below "average." In the diagram, 220 feet run in 10
seconds is the average. About half of the children run slower than this, and about
half as fast or faster. A runner going 220 feet in 10 seconds is at the 50th percentile,
or right at grade level for this group of runners. Grade level is simply the middle
score: half of the children in the norm group must be at grade level or below and
half at grade level or above. This fact should be kept in mind when interpreting
results for different children.

Grade-level equivalency scores again reflect how a national sample of students
at different grade levels actually performed on the test, not how children at a
particular grade level ought to perform. The diagram shows how a group of children
do run, not how they ought to run. Who indeed can say how fast all children in
grade three should run? Specifying how children at a particular grade level ought to
perform would require a statement of specific educational values and objectives in
terms of learning theory and developmental psychology. A consensus among
educators on issues of values, objectives and principles of educational psychology
for children at different grade levels in school is still lacking.

Thus, standardized tests reflect how children typically perform on particular
items, with tests designed so that half of the norm group scores below the average
score. These tests are strictly comparative measures. They do not indicate what a
child should be able to achieve in any absolute sense. We shall return later to this
point, which is crucial to an understanding of both the uses and abuses of
standardized tests.

Another way of reporting scores (also illustrated by Diagram I) is simply to divide
the diagram (or normal curve) into parts and report what part the student falls in.
Diagram I indicates a division into thirds, with the lowest 23 percent designated
"below average," the upper 23 percent "above average," and the middle 54 percent
"average." This division is completely arbitrary. The diagram and scores could be
divided into fourths (quartile scores), into fifths, or into however many parts we
felt would be useful.

Percentile ranks are a way of dividing test scores into 100 parts based on 100 percentile
points. Percentile ranks do not give the percentage of correct answers; rather, they
show what percentage of pupils in the norm group (national, state or local,
whichever norm group is used) scored at or below a certain level. For example, on
the Stanford Achievement Test in reading (paragraph meaning) mentioned earlier,
a beginning third-grade student with a score of 33 out of 60 would have a
percentile rank of 42 percent. This means that 42 percent of the children who took
the test at the beginning of the third grade in the Stanford national sample scored
33 or lower on the test; 58 percent got more than 33 out of 60 correct answers.
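
The difference between a percentile rank and a percentage of correct answers can be sketched in a few lines of Python. This is only an illustration: the function name `percentile_rank` and the ten norm-group scores are invented for the example and are not taken from the Stanford norms.

```python
def percentile_rank(score, norm_scores):
    """Percentage of norm-group scores that are at or below `score`."""
    at_or_below = sum(1 for s in norm_scores if s <= score)
    return 100.0 * at_or_below / len(norm_scores)


# Hypothetical norm group of ten raw scores on a 60-item test (invented data).
norm_group = [21, 25, 28, 30, 33, 35, 38, 41, 44, 50]

raw_score = 33
percent_correct = 100 * raw_score / 60          # share of items answered correctly
rank = percentile_rank(raw_score, norm_group)   # standing relative to the norm group

print(f"percent correct: {percent_correct}")
print(f"percentile rank: {rank}")
```

The two numbers answer different questions: the first depends only on the test, the second only on how the chosen norm group happened to perform.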

Percentile scores are shown along the bottom of Diagram I as "the number of
people behind a runner at this point." This means that, for example, after 10
seconds of running, 50 percent of the children have run 220 feet; thus a raw score
of 220 feet places a student at the 50th percentile, with half the students being
faster than that and half being slower.

Another common scoring system used by teachers is the "stanine" system. A
stanine scoring system divides normalized test scores into nine parts (hence the
name: sta for standard, nine for nine-point scale or division). As shown in
Diagram II, a score of five is average on a stanine scale, with the scale divided so
that 40 percent are below average and 40 percent above. (There are statistical
features of stanine scores related to how they are computed that give them some
additional specialized uses.)
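
A minimal sketch of how a percentile rank might be converted to a stanine follows. It assumes the conventional stanine shares of 4, 7, 12, 17, 20, 17, 12, 7 and 4 percent, and it assumes that a rank falling exactly on a boundary is assigned to the higher stanine; publishers may handle boundaries differently. The names `STANINE_SHARES`, `STANINE_CUTS` and `stanine` are illustrative only.

```python
from bisect import bisect_right
from itertools import accumulate

# Conventional share of the norm group in each stanine, 1 through 9 (percent).
STANINE_SHARES = [4, 7, 12, 17, 20, 17, 12, 7, 4]

# Cumulative percentile boundaries between adjacent stanines:
# 4, 11, 23, 40, 60, 77, 89, 96.
STANINE_CUTS = list(accumulate(STANINE_SHARES))[:-1]


def stanine(percentile_rank):
    """Map a percentile rank (0-100) to a stanine (1-9).

    Assumption: a rank exactly on a boundary goes to the higher stanine.
    """
    return bisect_right(STANINE_CUTS, percentile_rank) + 1


# Illustrative percentile ranks, not taken from any published norm table.
for rank in (36, 42, 50, 99):
    print(rank, stanine(rank))
```

With these shares, stanines one through four together hold 40 percent of the norm group, stanine five holds 20 percent, and stanines six through nine hold the remaining 40 percent, which matches the division described above.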



² I am indebted to Dr. Ralph H. Johnson, Director of Guidance and Assessment Services, Minneapolis Public Schools, for this diagram.

[Diagram II. The stanine reporting system: a nine-point scale with bands labelled "below average" and "above average." Percentile boundaries between stanines: 4, 11, 23, 40, 60, 77, 89, 96.]



The important point about these various ways of reporting test scores is that the
number of divisions used is totally arbitrary, depending upon particular uses,
personal preferences and, as often as not, tradition. Halves, thirds, quartiles, fifths,
100 percentile points: each of these reporting systems has advantages and
disadvantages. All are based on different ways of comparing test results to a norm
group.

Range Scores 

One of the greatest abuses of standardized test scores is the tendency to focus
on a single result (e.g., the student is at the 40th percentile) instead of on a range of
scores. Thinking about a range of scores (e.g., 35th to 45th percentile) is important
because it calls attention to the fact that all test scores are subject to error. For
many reasons, all tests involve some measurement error. Henry Dyer (1973) tells of
trying to explain to a governmental official that test scores, even on the most
reliable tests, have enough measurement error that they must be used extremely
cautiously. The government official, who happened to be an enthusiastic
proponent of performance contracting, responded that test makers should "get on
the ball" and start producing tests that "are 100 percent reliable under all
conditions."

Dyer's comments on this conversation are particularly relevant to an
understanding of error in tests. He asks:

How does one get across the shocking truth that 100 percent reliability in a test is
a fiction that, in the nature of the case, is unrealizable? How does one convey the
notion that the test reliability problem is not one of reducing measurement error to
absolute zero, but of minimizing it as far as practicable and doing one's best to
estimate whatever amount of error remains, so that one may act cautiously and
wisely in a world where all knowledge is approximate and not even death and taxes
are any longer certain? [p. 87]

All reputable test materials include a "standard error of measurement," which is
an index of the precision of measurement for individual students. This number
should be both added to and subtracted from the actual raw score to set the raw
score range. In our Stanford Achievement Test example the standard error is 2.5 for
grade three. Thus, for our student who scored 33 out of 60 the range for the raw
score would be 30.5 to 35.5. On paragraph meaning, the student's grade equivalent
is between 2.8 and 3.1, and the percentile ranking is between the 32nd and 50th
percentile.
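
The arithmetic of the raw score range is simple enough to sketch directly. The raw score of 33 and the standard error of 2.5 are the figures quoted in the text; the function name `score_band` is a hypothetical helper introduced only for this illustration.

```python
def score_band(raw_score, sem):
    """Return (low, high): the raw score minus and plus one standard error."""
    return raw_score - sem, raw_score + sem


# Figures from the Stanford Achievement Test example in the text.
low, high = score_band(33, 2.5)
print(f"raw score 33, band {low} to {high}")
```

Converting the two ends of the band into grade equivalents or percentile ranks still requires the publisher's norm tables, which are not reproduced here.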

The range avoids putting too much emphasis on a single score, an important
caution since so many pupils cluster around the middle of the test that the answer
to one question can raise or lower a percentile ranking considerably. For example,
continuing to use the same Stanford Achievement Test example, if our student who
got 33 out of 60 correct (42nd percentile) had answered just one more item correctly,
he or she would have jumped 8 percentile points to the 50th percentile; on the
other hand, missing one more item (a score of 32 out of 60) would have dropped
the student to the 36th percentile. Thus, a difference of one question right or
wrong covers a range of 14 percentile points.

Grade-equivalent scores sometimes allow for similar large jumps at the extremes
of the scale. In the Stanford Achievement Test cited earlier, 55 correct answers out
of 60 is equivalent to the beginning of sixth grade (6.0); one additional correct
answer moves the grade equivalency to 6.4, nearly halfway through sixth grade; yet
another additional correct answer and the grade equivalent becomes 6.9, while 58
answers out of 60 is more than halfway through seventh grade. By answering
correctly three additional questions at the upper end of the scale, a student can
jump a grade and a half in achievement.

A good deal of misunderstanding about and abuse of test scores, norms or
averages can be avoided if the focus of attention is on a range of results, not a
single number. Moreover, adding and subtracting the standard error of
measurement to a score assures us only that, statistically speaking, we can expect
the range to include the true score two-thirds of the time. Test statistics represent
probabilities, not certain results. One-third of the time there is a chance of error
even with a range based on the standard error of measurement.
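
The "two-thirds of the time" figure can be checked with a short calculation, under the assumption (not stated in the text, but standard in test theory) that measurement errors follow a normal distribution. The function name `coverage` is illustrative.

```python
import math


def coverage(k):
    """Probability that a normally distributed error lies within k standard errors."""
    return math.erf(k / math.sqrt(2))


# One standard error on either side covers roughly two-thirds of cases;
# two standard errors cover roughly 95 percent.
for k in (1, 2):
    print(k, round(coverage(k), 3))
```

Under that assumption, widening the band to two standard errors on either side reduces the chance of missing the true score to about one in twenty, at the price of a range twice as wide.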



Error 

Round numbers are always false. 

Samuel Johnson

Sources of error are many. The health of the child on the day the test is given can
affect the score. Whether or not the pupil had breakfast can make a difference.
Noise in the classroom, a sudden fire drill, whether the teacher or a stranger
gives the test, a broken pencil, and any number of similar disturbances can change
a test score. The mental state of the child (depression, boredom, elation, a
conflict at home, a fight with another student, anxiety about the test, a low
self-concept): all of these factors affect how well the student performs. Simple
mechanical errors such as marking the wrong box on the test sheet by accident,
accidentally skipping a question, or missing a word while reading are common
problems for all of us. Students who have trouble reading will perform poorly on
reading tests, but they are also likely to perform poorly on social studies, science
and arithmetic tests, because all of these tests require reading. Thus the test may
considerably underestimate the real knowledge of the child.

Some children perform better on tests because they have been taught how to
take written tests. Some children are simply better test-takers than other children
because of their background or personality or how seriously they treat the idea of
the test. Some schools make children sit all day long taking test after test,
sometimes for an entire week. Other schools give the tests for only a half day or
two hours a day to minimize fatigue and boredom. Some children like to take tests,
some do not. Some teachers help children with difficult words, or even read the
tests along with the children; others do not. Some schools devote their curriculum,
or at least some school time, to teaching students what is in the tests. Other
schools, notably alternative schools (open classrooms, free schools, street
academies), place little emphasis on test-taking and paper-and-pencil skills, thus
giving students less experience in the rigor and tricks of taking tests.

All these sources of error (and we have scarcely scratched the surface of such
possibilities) can seriously affect an individual child's score. Moreover, they have
virtually nothing to do with how "good" the test is, how carefully it was prepared,
and how valid its content is for a given child or group. Intrinsic to the nature of
standardized testing, these errors are always present to some extent and are largely
uncontrollable. They are the reason that statisticians can never develop a test that
is 100 percent reliable.

The errors are more or less serious depending on how a test is used. When
looking at test scores for large groups, we can expect that because of such errors
some students will perform above their true level and other students will perform
below their true score. For most groups, statisticians believe that these errors cancel
each other. The overly high scores of some students compensate for the overly low
scores of others so that the group result is relatively accurate. The larger the groups
tested, the more likely this is to be true.
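
How quickly errors cancel in a group average can be sketched with the usual square-root rule. This is a simplification resting on a strong assumption: each student's error is independent of every other student's. Disturbances that hit a whole class at once, such as a fire drill, do not cancel this way. The function name `group_mean_error` is hypothetical, and the value 2.5 is the single-student standard error quoted earlier in the text.

```python
import math


def group_mean_error(sem, n_students):
    """Standard error of a group's average score, assuming independent errors."""
    return sem / math.sqrt(n_students)


SEM = 2.5  # standard error for one student, from the Stanford example in the text

for n in (1, 25, 100, 400):
    print(n, group_mean_error(SEM, n))
```

On these assumptions, quadrupling the size of the group only halves the uncertainty in its average, so very large groups are needed before the group result becomes precise.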

However, for a specific individual, no other scores are available to make up for
the error in his/her score. The only hope is that the questions the student answered
wrong because of error will be compensated for by the questions he/she got right
either accidentally or by guessing. This type of error compensation is much less
reliable in correcting for error than the situation described for large groups. The
least reliable result is one individual's answer on a single question. Nothing can
compensate for error in this case. Thus, one must be extremely cautious about
making too much of results for individuals, particularly on single, specific test
questions and short tests.

Bias and Invalidity

It ain't so much the things we don't know that get us in trouble. It's the things we
know that ain't so. Artemus Ward

All the above sources of error have virtually nothing to do with the actual
content of tests. Even the best of tests, carefully prepared by the best-trained
professionals, are subject to the kinds of errors we described above. Unfortunately,
the tests themselves can also be biased or invalid. Whether or not a test is valid
depends on whether or not it measures what it is supposed to. Many tests are
biased in favor of white, middle-class people so that the tests are as much a
measure of ethnic, racial or social class origins as of reading, arithmetic or
intelligence. Rural children will have difficulty with a test aimed at and based on
city life, and city children will have trouble with tests containing a rural bias. Many
people feel that all standardized tests are culturally biased to some extent. This
severe problem of cultural bias is the center of much controversy among educators
(cf. J. McV. Hunt, 1973; Lavin, 1973; Willie, 1973; Meier, 1973) and is a major
factor to be considered in attempting to interpret the meaningfulness of test
scores.

Another source of content error, bias and invalidity is simply poor preparation of
test items. Questions that are ambiguous and unclear, have more than one possible
right answer, or are based more on ideology than on logic or facts are far more
common than one might suspect. Such questions can seriously affect test results
and are extremely unfair to students and others taking or using tests.

Good test questions are extremely difficult to write. The writer may be unaware
of his/her own bias. Unfortunately, the only way to determine if the test questions
are really fair for your child or your classroom situation is to discuss very carefully
every item on a test with every child, to understand what the items mean to the
children. Even this method is subject to error, depending upon how aware you are
of your own biases. Some examples may be helpful to illustrate common sources of
test invalidity.

Is there more than one right answer? Consider the following question from the
Metropolitan Achievement Test (MAT) used to test reading for children seven
years old:

To keep means to □ carry □ hold

After this test Derrick said, "When I want to keep something, I carry it." "No,"
said Yvette, "when I want to keep something, I hold it." In reporting these
children's comments Deborah Meier, Herb Mack and Ann Cook (1973) suggest that
"these two remarks tell us that children differ in the way they reason." Different
ways of reasoning make test items ambiguous and unclear. Such items do not
appear to be good measures of reading ability.

Is the question more ideological than factual? Consider this item from a ninth-
grade social studies test (Minnesota High School Achievement Examinations,
1974):

Prejudices are most frequently the result of
A. being born of foreign parents
B. living in communities with mixed racial groups
C. inadequate information
D. personal experiences
E. parental influences on children

Most sociologists, myself included, would argue that all these answers are true
under certain circumstances. Few sociologists, myself included, would agree with
the testmakers that "C" is the best factual answer.

Is the question meaningful and relevant or trivial and esoteric? Consider this
question, again from the same ninth-grade social studies test:

The person with the lowest level of mental ability would properly be classified as
A. an imbecile
B. an idiot
C. a moron
D. a slow learner
E. normal

It would seem to me that one could seriously question the relevance of this
question in a social studies test. The question tests rather trivial factual knowledge
about vocabulary and technical distinctions.

These examples point to merely a few of the problems in writing good test
questions that actually measure what they are supposed to measure. Tedious
though it may be to examine all the items on a test, if you are to properly interpret
it, you must find out whether or not you feel the test questions are valid and
meaningful. One good way to do so is to ask the students who took the test what
they meant when they answered the questions.



A Final Note 



But the very hairs of your head are all numbered.
Matthew 10:30

We began by noting that standardized testing now pervades our lives. Standardized
tests have been developed for almost every human characteristic one can
name. But testing is still a developing art subject to considerable error. Test results
are only one piece of information that can help us understand and clarify our
personal observations. But test results are easily abused, misunderstood and
misused, especially when applied in making vital decisions about the lives of
individuals.

Given the state of the art, standardized tests are no substitute for your own
carefully considered observations about children you personally know.



References

Anastasi, Anne. "Preface." Assessment in a Pluralistic Society: Proceedings of the 1972 Invitational Conference on Testing Problems. Princeton, NJ: Educational Testing Service, 1973.

Buros, Oscar K. (ed.). The Seventh Mental Measurements Yearbook. Highland Park, NJ: Gryphon Press, 1972.

Cohen, D. K. "Politics and Research: Evaluation of Social Action Programs in Education." Review of Educational Research 40 (1970): 213-38.

Dyer, Henry S. "Recycling the Problems in Testing." Assessment in a Pluralistic Society: Proceedings of the 1972 Invitational Conference on Testing Problems. Princeton, NJ: Educational Testing Service, 1973. Reprinted by permission.

Huff, Darrell. How To Lie with Statistics. New York: W. W. Norton, 1954. Copyright 1954 by Darrell Huff & Irving Geis. Reprinted by permission.

Hunt, J. McV. "Heredity, Environment, and Class or Ethnic Differences." Assessment in a Pluralistic Society: Proceedings of the 1972 Invitational Conference on Testing Problems. Princeton, NJ: Educational Testing Service, 1973.

Kirst, M. W., & E. K. Mosher. "Politics of Education." Review of Educational Research 39 (1969): 623-40.

Lavin, David E. "Sociological Determinants of Academic Performance." In The School in Society: Studies in the Sociology of Education, Sam D. Sieber & David E. Wilder, eds. New York: Free Press, 1973. Pp. 78-98.

Manning, W. H. "The Functions and Uses of Educational Measurement." Toward a Theory of Achievement: Proceedings of the 1969 Invitational Conference on Testing Problems. Princeton, NJ: Educational Testing Service, 1969.

McDill, Edward L., Mary S. McDill, & J. Timothy Sprehe. "Evaluation in Practice: Compensatory Education." In Evaluating Social Programs: Theory, Practice, and Politics, Peter H. Rossi & Walter Williams, eds. New York: Seminar Press, 1972. Pp. 141-85.

Meier, Deborah. Reading Failure and the Tests. New York: Workshop Center for Open Education, City College, 1973.

Meier, Deborah, Herb Mack, & Ann Cook. Reading Tests: Do They Hurt Your Child? New York: Workshop Center for Open Education, City College, and Community Resources Institute, 1973.

Minnesota High School Achievement Examinations, Form E, H Rev. Circle Pines, MN: American Guidance Service, 1974.

Nunnally, Jum C. Psychometric Theory. New York: McGraw-Hill, 1967.

Profiles of Performance in the Minneapolis Public Schools. Minneapolis, MN: Division of Planning and Support Services and Department of Guidance and Assessment Services, Minneapolis Public Schools, 1973.

Stanford Achievement Test: Primary Battery and Directions for Administering. New York: Harcourt, Brace & World, 1964.

Stanford Achievement Test: Advanced Battery Teacher's Guide for Interpreting. New York: Harcourt Brace Jovanovich, 1974.

Willie, Charles V. "A Theoretical Approach to Cultural and Biological Differences." Assessment in a Pluralistic Society: Proceedings of the 1972 Invitational Conference on Testing Problems. Princeton, NJ: Educational Testing Service, 1973.




Testing: Reform Is 
Not Enough! 



George E. Hein 



Evaluation is an integral part of the political processes of our society.
Ernest R. House (1973)

A number of recent publications have sharply criticized the standardized
achievement tests that form the basis for most evaluation of student progress in
American education today. James McDonald (1974) has called evaluation "the
major disaster area in education"; the Council for Basic Education (Weber, 1974)
and the National Council of Teachers of English (Venezky, 1974) have published
pamphlets highly critical of present standardized testing procedures.

An obvious response to such criticism is to undertake a major program to reform
or revise the tests. There is certainly much room for improvement. The questions
could be better, the standardization could be based on more representative
samples of the population, and the tests could be validated against criteria more
appropriate than the ones used. More imaginative use of the available technology
could vastly improve even paper-and-pencil machine-graded examinations. The
whole notion that the scoring and administration of all six of the most widely used
achievement tests is done on a basis of total questions right in each area without
any further modification is really quite absurd. Why not a choice of questions, or
questions that relate to a wider range of skill, or the possibility of more than one
correct answer in some cases? There is also no reason why achievement must be
tested only by paper-and-pencil measures. A much broader range of activities
could be standardized.

Unfortunately, any effort to reform the tests avoids a thorough analysis of the
reasons why the tests are so bad now. To assume that achieving better standardized
tests is simply a matter of making changes in the tests themselves is, I believe, to
hold a naive view of the education world and the society. It is highly unlikely that
all the people who put tests together, suggest the questions, write the language, try
them out on children, standardize them and finally publish and sell them are all
totally unperceptive and uneducated, nor that testing practice will be greatly
improved if the technical competence of its practitioners is increased. We must
recognize that the tests and their use are deeply embedded in the fabric of
American society and must be challenged on political grounds, not modified at the
technical level.

Lessons from Curriculum Reform Efforts

Any proposal for a major effort to produce new testing mechanisms is
reminiscent of the programs launched almost twenty years ago to produce new
science and math curricula. Scientists and mathematicians who turned their
attention to schools in the late 1950s were horrified at the state of the situation: the
curriculum was simply bad, they said, full of error, wrong concepts, incorrect
statements, too much stress on rote learning, simple drill, etc. They set out to
reform education and produced high-quality, innovative and up-to-date curricula.

One of the major learning experiences for those involved in curriculum reform
was that providing new curricula, although a necessary condition for better school
experiences for children, was hardly a sufficient change. In fact, much of the new
curricula was fitted neatly into existing school structures (indeed, it was designed
to do so) and, instead of the curricula changing the schools, the schools absorbed
the new curricula without much modification in the educational program offered
most children. In many cases the more innovative characteristics of the new
curricula and materials were simply ignored. While the New Math has had wide
acceptance in the schools, it would be hard to recognize its influence on day-to-day
classroom practices (Sarason, 1974) and difficult to discern it on the items
appearing on the standardized tests.

There is obviously some merit in developing more reasonable and wide-ranging
approaches to standardized testing, as long as one neither expects the task to be
simple nor hopes to change education by this means alone. The area of developing
alternative tests is wide open; remarkably little work has been done on it. The
standardized achievement tests and their companions, the widely used intelligence
tests, so dominate the field that little else has been explored and certainly few
other approaches have been carried very far.

The Case of the Automobile Industry

An analogy can be made to the automobile industry. At one point in the early
development of automobiles in the United States, many designs and approaches to
the problem of mechanical energy-driven vehicles were explored; engines powered
by electricity, steam and other fossil fuels (such as diesel fuel) competed with
those developed to use gasoline. The gasoline-powered internal combustion
engine was so successful, it spread so widely over the market, that many other
technologies were simply abandoned. Today we know a great deal about the
gasoline engine that uses rather a lot of gasoline and very little about the
alternatives. The recent sharp rise in fuel costs and increasingly serious concerns over the
automobile's role in pollution make us painfully aware of the social costs of this
unbalanced technological progress.

But another component of this analogy is not quite so innocent. The automobile
industry evolved policies that channeled and directed research, production and
expenditures in the direction of private automobile travel and away from mass
transit systems, which generally used different forms of locomotion. At the same
time, these decisions benefited a particular sector of private industry. They had
profound effects on our society. A recent Senate Subcommittee report states flatly
(Boston Globe, March 10, 1974):

GM, Ford, and Chrysler reshaped American ground transportation to serve
corporate wants instead of social needs. This study suggests that a monopoly in
ground vehicle production has led inevitably to a breakdown of the nation's
ground transportation.

Beginning in the 1920s General Motors began to buy up rail and electric urban
transportation systems and then replaced them with buses or diesel locomotives
which it manufactured.

The same report also documents (Boston Globe, March 3, 1974) that changes in
styling in the automobile industry through the years were not necessarily related to
improvements in technology.

We must recognize parallels in the continuing use of large-scale standardized
testing programs in our school systems. The companies that produce standardized
tests are analogous to the big three automobile manufacturers: they dominate their
market and dictate what is and is not acceptable. Their outlook is limited by what
they have found successful. Commercial self-interest makes it very unlikely they
will launch speculative new projects that might undercut their own positions. And,
like the big three automobile manufacturers, the publishers who produce testing
programs are not isolated from the rest of society. They have connections in
schools of education, foundations and government that reinforce each other and
thus tend to maintain the status quo, just as the automobile industry has
connections in research institutes, regulatory agencies and government.



Analysis of Costs

One strong argument that is continually made for maintaining the present
evaluation system is based on relative costs. It is simply a great deal cheaper in
dollars and cents to give the MAT (or one form of the Iowa Test) to every child in
the school system than it would be, for example, to introduce some sort of
individual observation system to determine the status of each child. But the total
expenses are so different that no direct comparisons can be made.

The cost of feeding the present testing machine is quite small in comparison
with setting up another one, but that does not mean the total investment in it is
small. Besides the cost of the millions of test booklets, which are not reusable,
there are a number of school personnel, especially in large city systems (but
smaller ones as well), whose sole job is to organize, administer and interpret the
test programs. Teachers and children spend a good deal of time giving and taking
tests. In some Follow Through sites as many as six weeks of the spring term are
essentially lost for instructional purposes while the classes go through the agony of
taking the various required tests dictated by the school district, the state and the
federal program. The whole experience disrupts instructional activities for a
month and a half (that is about 18 percent of the total school year). In addition, a
thorough analysis of costs must consider the human and social factors. The tests
affect the content and approach of educational programs; they tyrannize teachers
and demoralize students. Also, part of the cost is the incredible inefficiency of
testing schedules. Typically, children are tested sometime in the fall and spring,
and the comparative results are released very late in that year or, often, in the next
year. Teachers cannot even use the tests for their own teaching purposes; they
can only be used by outsiders, for purposes other than assisting instruction.

Social Implications of Testing 

None of the problems mentioned above would be seriously modified by the
availability of better standardized tests.

The major function of the present testing programs is not to determine how
much children know, to diagnose their learning stages, and to assist them in their
growth and development but, rather, to sort and classify children for their assigned
roles in society. For this task the present system, with its shoddy and discriminatory
tests, works quite well and almost independently of the quality of the tests
themselves. As Henry Dyer stated recently (1973):



The widespread use of tests for purposes of selection, for deciding from
Kindergarten on up who will pass and who will fail, who will be winners and who will be
losers, is not likely to go away in a hurry. For, whether we like it or not, it has
become indigenous to the kind of competitive culture that characterizes our social
institutions, including our educational institutions.

In a historical analysis of the standardized testing movement, Karier (1972) has
described how the use of tests to classify children, developed in the 1920s and
1930s, reflected the prejudices of our society prevalent at that time. Unfortunately,
the same views still influence decisions today. In a standard psychology text
published in 1970 (Bernard), individuals are classified according to IQ into
categories designated as follows:



Intellectual    Educational         Life Work                   IQ          % of
Category        Potential           Potential                   Range       Population

Feeble-minded   Uneducable          Typically dependent         Below 70    1
Borderline      Special schooling   Routine jobs                70-80       2
Slow learners   Special classes     Day laborers,               80-90       16
                                    routine jobs
Normal          High school, but    Laborers, semi-skilled      90-110      50
                perhaps with        jobs, clerical work,
                difficulty          some semi-professional
Superior        High school,        Skilled work, professional  110-120     16
                some college        work for some
Very superior   College             Skilled work,               120-130     2
                                    professional career
Gifted          Graduate school     Professional, creative      Above 130   1



The social implications of the view that only 2 percent of the population,
selected on the basis of a series of standardized paper and pencil tests, definitely
has the capacity for college work and professional careers and that at best 16
percent of the population has the potential for "some college" represents a
political judgment that would not be affected significantly even if the tests used
were less subject to technical criticism.

The possibility of high test scores is held out to parents as a way of providing a
great future for their children when, in fact, it would take a very high score indeed
to change significantly the life chances of poor children. The complex relation of
school to economics, to college admission, and to the job market is closely related
to larger social and political issues, including the prejudices of our society (Berg,
1970).

What the tests encourage is a lottery concept of education. It is true that an
unusually high-achieving child from deprived circumstances (a child who does
very well on the standardized tests) can break out of the bounds of the class in
which he or she lives and actually change status. But the odds against such an
occurrence are enormous. This kind of case (and there are some all the time) has
the same effect on redistribution of classes in society that the lottery has on
redistribution of income. The lottery in Massachusetts, for example, provides about a 13
million to one chance of winning $1,000,000. That means that, after 13 million
tickets are sold, one person may drastically change his or her economic status.
There are just enough winners of smaller amounts so that many people can
support the illusion that they too may be a winner, that they too can change their
status. But, of course, the actual number is so small that the few who break out of
poverty by winning the sweepstakes are insignificant for any change in class
alignment. Exactly the same reasoning supports the concept that good reading
scores will help populations break out of poverty or oppression. The actual number
of children who can change their status as a result of school success is trivial
compared to the total population of those condemned to poor jobs and continuing
poverty.

We can recognize that the tests do not necessarily reflect accurately the
abilities and knowledge of individual children by noting the number of exceptions to
expected results. Every person active in education has his or her own store of
anecdotes about Jane who did poorly on an MAT but could do the work, of Frankie
who could read only on the second-grade level and, after two months of help,
could read on the sixth-grade level, or Janice whose IQ rose 25 points in a year. In
some cases where people have looked at children carefully and worked with them
sensitively, whole classes and groups have increased their IQs (Rayder et al., 1973)
or their grade level achievement phenomenally over relatively short periods of
time. In Reading, How To (1973) Kohl reports the case of Lillian, a child whose
performance improved so much that it required the threat of a law suit to force the
school to accept the results of three reading tests.

Summary 

So before we advocate major programs to improve standardized tests, we have to
recognize their role in the American educational scene. The tests are one
component in the sorting system of American schools. They contribute one
element, but not the only one, one necessary condition, but not a sufficient one, to
see to it that the schools continue the society as it is. We have to be aware of the
use of testing in the society, of the social and political role of the education
system, and of the investment in the present evaluation structure. Only if we are
willing to reexamine the entire structure and nature of the education of our
children can we hope to achieve more equitable schooling that assists each child
to develop his or her full potential.

References

Berg, I. Education and Jobs: The Great Training Robbery. New York: Praeger, 1970.
Bernard, H. W. Human Development in Western Culture, 3d ed. Boston: Allyn & Bacon, 1970.
  Reprinted by permission of the publisher.
Boston Globe, Mar. 3, 1974.
Boston Globe, Mar. 10, 1974. Courtesy of The Boston Globe.
Dyer, H. S. "Testing Little Children: Some Old Problems in New Settings." Childhood Education 49, 7
  (Apr. 1973): 362-67.
House, Ernest R. School Evaluation: The Politics and Process. Berkeley, CA: McCutchan, 1973.
Karier, C. J. "Testing for Order and Control in the Corporate Liberal State." Educational Theory 22
  (1972): 159-60.
Kohl, H. Reading, How To. New York: Dutton, 1973. P. 20.
McDonald, James B. "An Evaluation of Evaluation." The Urban Review 7, 3 (1974).
Rayder, N., B. Body, & C. Nimnicht. "Assessing Follow Through." San Francisco: Far West Laboratory
  for Educational Research and Development, 1973.
Rothchild, Emma. Paradise Lost: Decline of the Auto-Industrial Age. New York: Random House.
Sarason, S. B. The Culture of the School and the Problem of Change. Boston: Allyn & Bacon, 1974.
  Ch. 4.
Venezky, R. L. Testing in Reading. Urbana, IL: National Council of Teachers of English, 1974.
Weber, George. "Uses and Abuses of Standardized Testing in the Schools." Occasional Paper No. 22.
  Washington, DC: Council for Basic Education, 1974.




Another Look
at What's Wrong 

with Reading Tests 



Deborah Meier 

A teacher listening to two children read the passage, "He lived in a big house,"
notes that one read, "He lived in a big apartment" and the other, "He lived in a big
horse." 1 Are both equally wrong? In fact, many teachers, like many tests, are prone
to consider the latter mistake less serious, since it's "off" by only a single
consonant rather than a whole word.

In our frantic effort to teach children how to beat the testing game we have lost
sight of the purpose of reading: to turn the written page into something that makes
sense. Tests (formal or informal) are, at best, only symptomatic, a roundabout way
of getting some hints as to what students are doing and whether schools are helping
them. But the nature of the tests we have devised and our single-minded focus on
them have led to a decline in concern for the real act of reading with its power to
explain, to influence and to move. (For example, amidst the presumably enormous
concern to improve reading in New York City, neighborhood libraries have been
cut back to a few hours a day and weekend use has been eliminated.) In order to
understand how this is so, we need clarification regarding both the act of reading
and the nature of testing. Particularly, we need (1) a definition of what we mean by
reading competence, (2) a closer look at the implicit underlying definition of
reading that is embodied in current reading tests, and (3) some alternative means
of assessing reading that would document better what is and direct attention
toward what could be.

Toward a Definition of Reading

We pretend (to parents, teachers and children) that it is enough that a child be
drilled into changing a set of visual symbols into oral ones. We act as though this
action were reading. (It is such a commonly accepted definition of reading that the
parents of fluent readers sometimes complain that schools are not teaching their
children phonics.) We call this behavior "decoding" although, as linguist Frank
Smith points out, actually it is merely translating from one code (visual) into
another code (oral).2 Such a skill, useful as it may be, is a very trivial one. For
example, I can do it for Spanish without being able to understand a thing I am
saying.

We are in fact faced with a vast number of students who have made this first
translation into oral reading. They can decode, yet they are still at sea. We are all in
this fix sometimes when reading something we find difficult. The problem we face
is not "breaking the visual to oral code." Our difficulty lies in the subject matter
itself or the language used to describe it. At such times we cannot translate the
visual or the oral symbols into significant meanings. Turning those marks on the
page into meaning is what constitutes reading. To do this means bringing a lot into
the act of reading quite aside from what we know about the visual marks on the
written page.



When a college professor complains that his students these days do not even
know how to read, he naturally means (although he may not realize it) that such
students cannot make sense of the reading material he minimally expects of a
college freshman. The distinction between a 12.9 and a 6.9 reader (using the
grade-level equivalents of the standardized reading measures), after all, is not that one
can read and the other cannot. The difference lies in what they can get meaning
from or the different sorts of meaning they take from the same material. What
changes (or what should change) over time is what we bring to the act of reading.

Many experts have suggested as a definition of reading skill qua reading skill
(literacy), the closing of the gap between what one can make sense of orally and
what one can comprehend visually. 3 Given such a closure, the school's task should
be to help children make sense out of more of the world.

What Is Happening Back in the Real World?

In the absence of an acceptance of the kind of definition of reading outlined
briefly above, the tests themselves have become a kind of implicit definition.
When we ask a teacher or parent about a child's reading, they all increasingly fall
back on the jargon of test scores. All common-sense judgments are abandoned.
Even children begin to judge their reading as though it were merely an extension of
testing. And no wonder, when even the best intentioned of us begin to urge
children to read in order to raise their test scores.

Those children who come to reading easily are the least injured by this, although
they too are encouraged to focus narrowly on keeping ahead on the tests.
Upper-grade children who are already fluent readers, especially if they are in inner-city
schools, are often kept busy filling in blanks and drilling on subskills that might
appear on tests while the content of the world is skipped over as a luxury. Good
books are used merely to teach test skills, "getting the meaning" or "inferential
thinking."

But the children who badly need the teacher's assistance are the most seriously
handicapped. While we hammer away at skill tasks that appear on tests, we often
deprive these youngsters of the kind of knowledge and language experience that
they badly need to bring to their reading. Even the skill-tasks themselves are often
justified by the teacher only on the grounds that they are necessary for the tests.
They too may leave children as much in the dark as ever about how to use their
own natural intelligence to work out the relationships between the visual symbols
and the world of meaning. Worse still, these tasks convince some children not to
trust their own intelligent hypotheses, thus making it virtually impossible for them
to develop fluency.

The task of making sense of the written word is a procedure very similar to one
all children accomplished just a few years earlier, when they learned to talk. It is
well to remember that the children who enter school, including the most
disadvantaged, have only recently constructed and verified a set of bewilderingly
complex rules that "summarize the relationships and regularities underlying
language." 4 They succeed "even though adults are far from any understanding of
what these relationships and regularities are, let alone how to impart them through
formal instruction." 5 They used a process of trial and error, and plenty of error! To
encourage such experimentation we merely supported the never-stop noisemaking
and monologue-like conversation of small children. We responded to the sense of
what they were saying whenever appropriate. We did not categorize them at each
successive stage, isolate the sounds or rules for their practice, count their errors or
restrict them on the basis of some prior logical notion of "sequence." We did not
need a test to know if they were progressing or whether indeed they knew "how to
talk" when they came to school. We have no comparable grade-level oral language
standards, important as oral language is. Nor do we confuse good talking with
conscious knowledge of the way language is put together, intriguing as the latter
may be.

Reading, like talking, appears to be logically impossible only when we persist in
acting as though one needed to know all about it in order to do it. We appear to
believe that learning to read requires both a superhuman memory and a quite
impossible memory retrieval system. We appear to do so in a period of educational
history in which we simultaneously admire the research of Jean Piaget, which leads
us to conclude that young children's thinking is still very concrete and utilizes a
form of logic that seems indeed illogical to adults. Yet we try to teach children to
read as though they were indeed highly sophisticated and self-conscious computer
programmers (and as though our system of written language was "computerable").
That it sometimes appears to work is a credit to the flexibility and tenacity of
human intelligence, to pick out what it needs and discard the rest. That it so often
does not lead to anything resembling the real act of reading is hardly surprising.

But we persist in this view of our task since the one thing such an approach may
indeed succeed in doing, at least in the short run, is raise reading scores.

Even when children get past the first roadblock and begin to read with some
fluency, new obstacles appear in the form of new test-related demands. For what is
especially vicious about this test-dominated approach is that it never ends. There
are always more tasks that will make you read (i.e., test) even better. At no point
along the way is one allowed to say, "Goodness, he reads! On to other things."

For example, in a study of the Stanford Reading Diagnostic using inner-city
Philadelphia seventh-graders, it appeared that many children scored well on the
reading comprehension subsection (the only part that comes close to measuring
reading per se) but were pulled down by low scores on subsections measuring
auditory discrimination, blending and syllabication. 6 Since the youngsters' final
scores reflected the sum of all the parts (and since teachers, children and programs
are judged by such final scores), many odd classroom procedures naturally follow.

Conscientious teachers give such children remediation tasks on the subskills
they tested poorly on. They do so regardless of any evidence that this will lead to
improvement in comprehension. Children are drilled on recognizing similarities
between certain isolated sounds ("Which word," the teacher might ask, "has the
same sound in the middle as '[?]at': table, run, camp or seat?"). They spend hours
learning rules to help them decide whether to break "riddle" into "ri-ddle,"
"rid-dle" or "ridd-le." Considerable energy is also spent helping students decipher
test instructions, since all these skills will be in vain if the student cannot
demonstrate them successfully on the test. Little time remains for reading or for
other content areas. The child has been trapped. Any other course seems to court
disaster, being labeled "below grade."

The Tests

It is critical to recognize that a test is based on assumptions regarding what is
being measured and why. The difficulty with our reading tests is that we have
accepted the machinery of the tests without having questioned whether we agree
with their implicit or explicit definition of reading or of reading progress. Even less
so have we agreed on how such reading is acquired. (In fact, the testmakers deny
having either a definition or a theory. They are merely measurement men.)

The tests are constructed not from an explicit theory of reading but out of an
eclectic potpourri of items whose justification lies in the fact that they have a high
degree of correlation with later school success, are consistent with other similar
tests, and produce a normal curve. The midpoint along this curve then constitutes
what the layman mistakenly assumes is the "should" of reading. Does this sound
too slipshod, unfair, unlikely as a description? 7

In the summer of 1973 a group of respected reading and testing experts met
together in Georgetown, Washington, DC, under the auspices of the International
Reading Association. They had a hard time agreeing about much, particularly
about what to do or say to the public at large. But there was, as one participant
noted in summarizing the conference, 8 virtually unanimous agreement that all the
existing normative-based reading tests were without a theoretical rationale, had
"little relevance for instruction and were not designed to measure or record
educational improvement." The experts agreed that the tests "both mask and
distort the real issues involved in the acquisition of reading skill" and that there is
today "no definitive knowledge regarding either the sequential learnings or
component skills that children must acquire in order to read successfully." They
further endorsed the notion that "especially in the acquisition period" reading tests
should be "program specific," testing "only what has been directly taught or
indirectly fostered."

Alternatives

There are many alternative forms of assessment. No perfect ones exist for all
purposes and all programs. 9 For example, the use of individually administered
reading inventories such as the Spache 10 or the Silvaroli 11 is a step in the right
direction if we want rough comparative data on individual skill. They also can
provide some diagnostic information, although so can any teacher who gives
attention to a child's reading. Kenneth Goodman's Miscue Inventory 12 is better as
a tool for gaining insight into a child's individual approach, although too detailed
and complex for everyday use.

Short, program-specific tests designed to fit the activities of a particular
classroom or program are manifold. They often come with commercial reading systems,
or can be quickly whipped up by a teacher to see if what has been specifically
taught has been learned.

Good anecdotal documentation of observed student language and reading is
done by many good teachers and researchers and could yield rich information that
is both diagnostic and suggestive if we chose to spend our time and money that
way. 13

For obtaining general data on larger populations, programs or trends,
particularly beyond the stage of minimal reading acquisition, random sampling
techniques applied even to the existing normative-type tests would be preferable.
Random sampling would also make alternative, more individualized methods
financially feasible and could thereby provide far richer data. Incidentally, it
would also avoid the enormous and unbeatable problem of cheating that is
encouraged under present circumstances, since it removes both the opportunity
and the incentive to coach for the test or cheat during it. The English have, for
example, given a short individually administered reading test to a sample
population every ten years. 14 While the English system of testing has also received
criticism for archaic language as well as methodology, it has at least attempted to
develop comparative longitudinal data without distorting the educational process
by the evaluation of it.
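
The sampling argument can be made concrete with a small sketch. It is offered only as an illustration, not as a procedure taken from the essay: the district of 50,000 pupils, the invented scores (mean 50, spread 10), the sample of 500 and the function name estimate_mean_score are all assumptions. Under those assumptions, testing one pupil in a hundred recovers the district average to within a fraction of a point.

```python
import random
import statistics


def estimate_mean_score(scores, sample_size, seed=0):
    """Estimate a population's mean score from a simple random sample.

    Returns (sample_mean, standard_error_of_the_mean).
    """
    rng = random.Random(seed)
    # Draw pupils without replacement; every pupil is equally likely to be chosen.
    sample = rng.sample(scores, sample_size)
    sample_mean = statistics.mean(sample)
    # Standard error of the mean (finite-population correction ignored).
    std_error = statistics.stdev(sample) / sample_size ** 0.5
    return sample_mean, std_error


# Hypothetical district of 50,000 pupils with invented scores (mean 50, spread 10).
_rng = random.Random(42)
district_scores = [_rng.gauss(50, 10) for _ in range(50_000)]

true_mean = statistics.mean(district_scores)
sample_mean, std_error = estimate_mean_score(district_scores, sample_size=500)

print(f"Mean from testing all 50,000 pupils: {true_mean:.2f}")
print(f"Mean from testing 500 pupils:        {sample_mean:.2f} (+/- {std_error:.2f})")
```

Because no individual pupil, teacher or classroom can be singled out from such a sample, the sketch is also consistent with the point that sampling removes the incentive to coach for the test.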

The problem of assessment, in fact, is so equally well met by other and even
cheaper methods than the current mass testings that one is led to conclude that
there may be method to this madness. 15



Conclusion 

Most standardized reading tests play a negative role. They discourage us from
using schools to help children become readers. Those with other resources learn
anyway; those least advantaged are as usual stumped.

The tests encourage us to fall for the notion that reading is mostly just a "trick," a
useful one for "getting ahead" in the "real world" rather than a means of expanding
our understanding of it.

It is well to remember that schools cannot get everyone above the median. It
stands to reason, given other facts of life, that the least advantaged will more likely
fall below the median than above. This natural fact of life is reinforced by the
nature of any standardized instrument weighted, as it must be, toward the culture
and associative patterns of the mainstream child. 16

Still, schools could succeed in making almost all children good readers. This
appears otherwise only so long as we define good reading as a point on the normal
curve. If we fall for that frame of reference, it is indeed a logical contradiction to
seek improvement.

To say there is no intrinsic necessity for the poor to turn off written language is
not to pretend that we could alter the class structure, achieve economic and social
equality, produce vast changes in patterns of mobility, or "even" reverse the
locations of socioeconomic groupings on a normative scale by making all children
competent readers. But merely because we cannot achieve all this just by helping
children be readers does not mean it is not worth doing.

References 

1 This marvelous example was raised by Frank Smith in a seminar on reading held by the City College
Workshop in Open Education in May 1974.

2 Frank Smith, Psycholinguistics and Reading, 1974, and Understanding Reading, 1973 (New York:
Holt, Rinehart & Winston). Two superb books describing what is; "musts" for those who take this
issue seriously.

3 This definition was proposed by one of the working groups at the Georgetown IRA Conference, and
primarily was pressed by L. Gleitman (University of Pennsylvania) and T. Sticht (Human Resources
Research Organization). It also appears in Herbert Kohl's recent book, Reading, How To (New York: E. P.
Dutton, 1973).

4-5 Frank Smith, op. cit.

6 Virginia Allen, Triple "T" Project Monograph Series #1, "What Does a Reading Test Test?"
(Philadelphia: Temple University College of Education, 1974), pp. 1-21.

7 Which is not to say that technically the tests are not often carefully and conscientiously, even
painstakingly, well put together. The criticism is largely of the rationale, not the skill with which this
rationale of testing is pursued.

8 Report by Anne Bussis of Educational Testing Service, "Memo For the Record," Dec. 1973.

9 See excellent summary of this issue, "Consumer Awareness in Test Reviews," by Roger Farr & Wm.
Ellery (Washington, DC: Linguistics Institute, Georgetown University, Aug. 1973), mimeo.

10 Diagnostic Reading Scales, devised by George D. Spache (published by Calif. Test Bureau of
McGraw-Hill, 1963). Also see Vivienne Garfinkle's guide to this material (published by Project Follow
Through, Bank St. College of Education, New York, Feb. 1, [year illegible]).

11 Nicholas Silvaroli, Classroom Reading Inventory, 2d Ed. (Dubuque, IA: Wm. C. Brown, 1973).

12 Yetta Goodman & Carolyn Burke, Reading Miscue Inventory: Procedure for Diagnosis and
Evaluation (New York: Macmillan, 1972).

13 The Workshop Center for Open Education, City College of New York (6 Shepard Hall, Convent
Ave. & 140th St., New York City 10031) has published various brochures on reading profiles and other
informal evaluation tools. See also studies by Patricia Carini of The Prospect School, North
Bennington, VT, e.g., "A Methodology for Evaluating Innovative Programs," June 1969.

14 Note this remarkable rationale for the English system, published in a Department of Education
pamphlet (No. 50) in 1966: "Perhaps the strongest arguments in favour of the present test are first that it
implies a definition of reading ability that is in accordance with common sense, and secondly that it
takes no more than ten minutes of the pupil's time. This economy is valuable, since the business of
the school is not to test but to teach. Moreover it is not ten minutes of every pupil's time that is
needed."

15 A number of provocative essays on the history of normative-type tests raise important questions
regarding the larger social function of looking at children in this particular way. See, for example,
"Testing for Order and Control," by Clarence J. Karier, Educational Theory, Spring 1972, pp. 159-80.

16 See "Reading Failure and the Tests," by Deborah Meier, City College Workshop Center, An
Occasional Paper, Feb. 1973. See also "What's Wrong with the Tests" by Deborah Meier, Notes from
City College Advisory Service to Open Corridors, Mar. 1972, pp. 3-
i 



The Stranglehold 
of Norms on the 
Individual Child

Lois Barclay Murphy



Our children are choking in a stranglehold of norms. What do I mean by
"stranglehold"? I am talking about the stifling, asphyxiating, smothering effect that
comes from pigeonholing children in terms of test scores, of normative categories
of pathology and nonconformity to social demands. The breath of learning
requires oxygen for mental growth and respect for the integrity of the child's
individual psychic metabolism as well as his physical idiosyncrasies. Confinement
to a narrow, tight, constrained mental and emotional environment limited by
statistically based norms and unrealistically restricted expectations can starve as
well as frustrate the child, just as the Berkeley rat cages interfered with optimal
development of brain tissue and problem-solving in little rats (Rosenzweig, 19[?]2).

The strangling and starvation go on in schools, hospitals, clinics, families,
wherever we freeze our expectations of a child in terms of the belief that tests are
the truth, the whole truth and the final permanent truth about a child's
potentialities. Judgments are too often weighted on the negative side. Instead of asking
about how sick, or bad, or problem-ridden a child is, we could ask: "How is he
dealing with the complex life situation he is in, with its remoteness from his style,
his longings, his realities, his needs, his particular balance or imbalance of
strengths and vulnerabilities?" We could ask: "How can we support his integrity,
build on his resources, help him in ways he wants and is ready to be helped,
wherever we find him?"

How and When To Look at Tests 

Some wise teachers say: "I don't look at the tests until I've had time to get
acquainted with the child and discover what he responds to, what skills he has,
what he is interested in, what his tempo is, what he is hiding and what he lets us see
(I usually need a month or six weeks for that). Then I look at the test results in
relation to the way he is functioning. Sometimes he does better than the tests
would lead one to expect, sometimes not so well. If the latter, I can explore
different avenues of reaching him, to bring out his better level, and I try to find out
what is hurting, or why he cannot bring to the group what he was able to bring in
the test situation. If he does better in the classroom than he did on the tests, I try to
find out what bugged him in the tests, or what they did not give him a chance to
show."

Tests can enrich the teacher's insight or the therapist's understanding only when
the details are looked at in relation to the child's experience in the test situation:
what he was coping with and how.

Speed norms can be especially misleading, as we know from studies of American
Indian children (Klineberg, 1928) and from many children whom strangeness in
a situation slows down as they try to grasp what is going on.

Age norms have been useful in gross distinctions between severely retarded and 
A * »♦ 

Adapted from a presentation to the American Orthopsychiatric Association at its 1972 Annual Conference in Detroit,
Michigan; first published in Childhood Education, April 1973, pp. 343-49.



adequately endowed children and, more broadly still, between capacities of
children and adults. But they have led to rigid assumptions regarding
age-appropriate behavior and unrealistic pressures on children to behave like a
twelve-year-old or a fourteen-year-old, as I shall document later, or even at younger age
levels to "be a big boy." Pressures to meet arbitrary sex-role standards especially
ignore the wide variety of growing-up patterns exhibited by children in longitudinal
studies. Mental health norms and concepts of problem behavior are often
inconsistent with what we know of the vicissitudes accompanying the process of
growing up, the long struggle with remarkably unique patterns of vulnerability and
strength, and the very important "Toynbee effect," the response to challenges
evoked by the confrontation of one's specific checkerboard of weaknesses and
resources with the environmental checkerboard of stresses and supports. To a large
extent, each child's development is a mystery story whose outcome we cannot
really predict. The complexity of the developmental process with the emerging
capacities, drives, investments, conflicts is still far beyond our complete
comprehension, at our present primitive stage of understanding.

Some Key Sources on Human Development

The most important volume on human development to appear to date became
available in time for me to use some of the findings to document this thesis. I refer
to The Course of Human Development (1971) by Jones, Bayley, Macfarlane, and
Honzik, of the Institute of Human Development at the University of California.
This book contains edited versions of over sixty papers on physical, mental,
emotional and social development of the children, now adults, in three major
longitudinal studies at Berkeley begun in the 1920s and continued into the present.
It is not only a gold mine but the major gold mine of solid findings on
development, with implications everyone working responsibly with children must
take into account.

Here I can share with you only a few highlights, which I believe should be taken
most seriously and which, if we do, should challenge and turn around some of our
pet rigid assumptions about behavior and development.

Physical Development and Behavior

Let us start with the reports about relations between physical development and
behavior, some of which should be familiar but generally are not, from the volume
by Stolz and Stolz (1951), the papers by M. C. Jones and N. Bayley (1950) and by
H. E. Jones (1971). These related articles document ways the 20 percent of girls who
mature earliest get into difficulties as a result: their social and sexual drives are
precocious in relation to their intellectual development; they may attract older
boys, be full of adolescent fantasies too early and be out of harness with their
slower, more-average peers. Since boys generally mature more slowly than girls, the
slowest 20 percent of boys are also out on a limb, left to feel inadequate with both
boys and girls, isolated, rejected, and of course they develop some form of coping
and defense techniques to deal with their situation. Similar dilemmas are
experienced by children who may not be in the extreme 20 percent, but whose
growth pattern is variable.

In other words, it behooves us to watch closely to see exactly what the
developmental situation and problem are for the child before we complain that he or she is
not "acting appropriately for his age." How can they fit into a norm that is
inappropriate to their individual patterns of development? Physical measures such as
height were much more predictive over a long age-span than mental test measures,



[...] the support for the thesis Dr. Murphy presents herein comes from longitudinal studies of normal children,
initiated at the Menninger Foundation in Topeka, Kansas, by S. Escalona and M. Leitch in 1948, and carried further under
[...], as well as from her prior research at Sarah Lawrence College. Support was contributed for
[...] from the Pfeiffer and [...] Foundations and from the National Institute of Mental Health, and for
the latter studies [...] from the Josiah Macy Foundation.

while these were better than personality measures. Correlations for height between
ages three and eighteen years were in the .70s, whereas mental-test measures were
around .40. But few personality measures reached .40. It is worthwhile to look
closely at IQs. Macfarlane (1971) observes that for the eight tests given between
ages six and eighteen, only 15 percent of the children showed a range of less than
10 IQ points; 58 percent showed a change of 15 IQ points or more. One-third of the
children showed a range of 20 IQ points or more. Ten percent showed a range of 30
points or more. These results are in line with our Topeka findings (Moriarty, 1966)
and, in addition, data from the Fels Research Institute (Sontag, Baker and Nelson,
1958), as well as other studies such as Nancy Bayley's analyses of sequences in IQ
(1949).

Misconceptions About Mental Development 

Macfarlane concludes that "little reliance can be placed on one test." Beyond
this, she notes the striking finding that a number of men with poor records both on
mental tests and school grades, right through high school, came as adults into
positions requiring creativity and high intelligence. For example, one man had an
average IQ through his developmental years around 100; he was held over three
times in elementary school, and finally graduated from high school at age
twenty-one without college recommendations. He left the community, made up
his high school deficiencies and now is a highly talented architect. Currently he is
living out a "normal life" through his children, being active in his community, and
finding life exciting and satisfying after thinking of himself as "a listless oddball"
during childhood and adolescence.

Social Development

Macfarlane also tells us that, of the children studied by a large research staff
with different theoretical biases, close to 50 percent turned out to be more stable
and effective adults than any staff member had predicted, 20 percent were less
substantial than predicted, and scarcely one-third turned out as predicted.

Among the 10 percent who turned out far better than predicted were two who
persistently spent their energies in defiance of regulations, getting marginal or
failing grades throughout their schooling, and finally getting expelled at ages
fifteen and sixteen. Macfarlane finds both of them to be wise, understanding
parents now, who appreciate the complexities of life; moreover, they are
humorous and compassionate.

In reflecting on the factors contributing to erroneous predictions, Macfarlane
remarks that no one becomes mature without the pains and confusions of maturing
experiences. Even experiences that looked traumatic at the time are now regarded
by subjects as forcing them to come to terms with what they wanted and did not
want out of their lives, and to shift their behavior in the direction of goals they
clarified. Also, many times, behavior considered unpromising by clinical
examiners, such as overdependence, was converted into nurturance by adulthood
and not overprotection — since these people wanted their children to avoid the
overdependence they themselves had experienced. There were also "late
bloomers" who blossomed only after they got away from their families and were
released to be themselves. Macfarlane emphasizes the capacity of these young
people to drop early habits and behavior that got in their way as adults, and to
develop new patterns on a trial-and-error basis. She also emphasizes the tendency
of a clinical staff to overweight pathogenic aspects of behavior seen in childhood
and to give too little weight to the maturity-inducing aspects.




Some Basic Findings on Individuality

I have begun with the most familiar areas for clinical and educational workers.
Let me go on to some very fundamental findings from the medical, physiological
and biochemical areas. Another important volume for clinical workers must be The
Biology of Human Variation. Here Dr. Sontag (1966) of the Fels Research Institute
reports evidence that the heart rate of the fetus, influenced as it may be by the
fetal environment, tends to persist to adulthood. And we are beginning to have
data on the relation of behavior to heart functioning and other autonomic
reactivity patterns. Granted that the autonomic nervous system interacts with the
central nervous system and is influenced by cognitive functioning — as
biofeedback studies are showing (Hefferline and Bruno, 1971) — the influence from
autonomic functioning to behavior urgently demands attention. In fact, it may
well be that we will understand more about variability in IQ when we study the
relation of these variations to autonomic reactivity.

Still another basic contribution to our thinking is that of Roger Williams in his
book on Biochemical Individuality (1956). Here he documents the extraordinary
variations in individual structure and needs — from sizes and shapes of stomach
and intestines to the most extreme differences in needs for each of the different
vitamins. Such findings simply add another dimension to the extensive and solid
data on individuality of growth patterns from infancy to adulthood (Shirley, 1931;
Olson, 1943; and others).

Misuse of Statistics

We really don't need any more support for the necessity of recognizing
individuality at every level but, just to cap the climax, I will mention the comments
of one of the world's foremost statisticians, C. Radhakrishna Rao (1965), to the
effect that much of our statistical thinking is unsound when we draw conclusions
from average scores in large groups, while ignoring the scores of counterbalancing
subgroups that contribute to the averages. I emphasize this point because the
averages are used as a basis for norms — norms which we have already seen are
misleading, as in the data on variations in ages of maturation at puberty.
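
The point about counterbalancing subgroups can be made concrete with a small sketch. The numbers below are invented purely for illustration (they are not data from Rao or from any study cited here), and the names `early_maturers`, `late_maturers`, and `summarize` are hypothetical. The sketch shows two subgroups whose score changes cancel in the pooled average, so that a norm built on the average describes neither subgroup.

```python
from statistics import mean

# Hypothetical score changes (illustrative numbers only, not real data).
early_maturers = [8, 10, 12, 9, 11]      # subgroup whose scores rise
late_maturers = [-9, -11, -10, -12, -8]  # subgroup whose scores fall


def summarize(groups):
    """Return the pooled mean and a dict of per-subgroup means."""
    pooled = [score for scores in groups.values() for score in scores]
    subgroup_means = {name: mean(scores) for name, scores in groups.items()}
    return mean(pooled), subgroup_means


pooled_mean, subgroup_means = summarize(
    {"early": early_maturers, "late": late_maturers}
)

print("pooled mean:", pooled_mean)
print("subgroup means:", subgroup_means)
```

In this invented case the pooled mean suggests "no change," while every child in fact changed by eight points or more in one direction or the other.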

Rethinking Frozen Concepts

We have been talking about individuality, plasticity and the capacity for
self-directed change. Macfarlane also emphasized the trap of pathology-oriented
thinking. Surely the data amassed over the last fifty years demand that we rethink
our frozen concepts, loosen up, confront the realities of child development and
come up with some better, more realistic if less quick and easy concepts. Reliance
on the IQ has stultified our thinking about potentialities of children. Reliance on
pathology-dominated concepts of drives has distorted our thinking, even polluted
it.

What are the alternatives, possible ways of thinking about children that might be
more fair to the child and his potential development?

Using Data Qualitatively

First, we need to recognize that IQ tests, personality tests, and the rest are
limited. We can use what they tell us about what the child does at this moment
under these confining, distracting, uncomfortable, frightening, boring, uninspiring
conditions. They don't tell us what the child does under other conditions, or what
he might do if comfortable, stimulated or inspired, healthier, or less bogged down
in family anxieties. As one little boy in the Topeka studies implied, they do not
even necessarily tell what the child can do right now. He asked, "Why don't you
ask me to do what I can do?"

Along with this, we need to recognize that the specifics of the test may be much
more illuminating than the IQ, which is the average of all the functions. One little
girl who barely passed seventh-year-level tests on routine items passed
twelfth-year-level tests involving insight and comprehension. Although she was retarded in
reading at that time, she has become an expressive and original writer and a
remarkably intuitive and creative mother. She was considered a "slow learner" in
the second grade, while all the time she was storing away observations and
reflections in her independent sensitive way. I could give other examples from our
Topeka studies (Murphy and Moriarty): a girl with an early IQ of 100, who was
considered in college the most outstanding candidate for a music scholarship, and
is now having her graduate year of practice teaching in preparation for a career as
a music teacher. In her case, the incidental observations of her social awareness
could have provided a better basis for prediction of later development.

Understanding Motivation — Some Examples

Along with the qualitative use of data from tests of all sorts and the observations
that can be made during, before and after tests, we need a drastically new
approach to motivations and drives. Perhaps we will be on more solid ground if we
ask: "What is this child's situation, the positives and negatives (roadblocks,
frustrations, etc.) for him; what can we learn about the positives and negatives of the
equipment he brings to dealing with the situation — the areas of strength and of
vulnerability; and in terms of his plus and minus resources what is he trying to do
with his situation?" In Juvenile Court one Monday morning I saw a ghetto boy
brought in for picking up some discarded metal tubing in a construction area; I
don't know just what he was going to do with it, but no one asked him. Here is a
boy for whom the city provided no play space, nothing to explore or to create with.
He finds something he might use. Bravo! A boy with some initiative, some active
drive to pick up a crumb of possible value in the arid ghetto where he lives. Let us
get him into a shop for metal- and wood-working where he can try out his ideas
instead of sending him to Detention where he'll learn fascinating criminal
techniques.

Here is another six-year-old boy with an extremely disturbed mother. I happened
to be nearby when I heard him yell to a friend, "Bill, let's get the hell out of here,
Mom's starting to go on a rampage." His drive to survival was being expressed in
utterly healthy and sensible escape. At school his teacher reported that he didn't
seem to trust any adults and was not "learning." Of course, he had learned some
very basic things, and was using his learning well in terms of what he had
experienced of life so far.

Another six-year-old boy who rejected a very rigid teacher was placed in a
different school and reported, "This new teacher understands children much better
than that other teacher." He was correct — just one instance of how important it is
to listen to the observations, judgments and points of view of children.

The examples I have given illustrate the child's integrity as an autonomous
growing person, appraising his environment, finding ways to survive in it,
developing whatever coping methods and defenses he can devise to get along with
the business of growing up and get along in the situation in which he finds himself.
The extensive documentation of transitoriness of fears (Jersild, 1935), behavior
problems (Macfarlane, Allen and Honzik, 1971) and even changes in body-build
(Stolz and Stolz, 1951) attest to the extent of the child's plasticity and capacity for
change, and for progress in mastery and outgrowing earlier patterns he does not
need any longer. Topeka (Murphy and Moriarty) and Berkeley (Macfarlane,
"Perspectives," 1971) studies document the positive outcomes that can emerge
from the child's mastery of his vulnerability and the stresses he successively faces.

Discovering Coping Strategies

The obvious conclusion is that we need to focus on and better understand the
nature of ongoing current coping struggles, how to support them, how to help the
child to extract the strength and insight that successive experiences may make
available to him. We need to understand the positive strategic values of
withdrawal in certain situations, and be very cautious about talking about a
"withdrawn child." Similarly we need to respect and value children's protests,
resistances, attempts to change or control situations, and all the other active
coping efforts that can give us cues to what the child finds intolerable, unsuitable,
boring, distasteful or threatening to his integrity. I am not offering a new scale or a
new test to freeze your thinking once again. I am pleading instead that each
clinician, each teacher, use all of the available resources along with his own fresh
look at the child in his situation in order to discover the meaning of the child's
behavior from the child's own point of view.



References

Bayley, N. "Consistency and Variability in the Growth of Intelligence from Birth to Eighteen Years."
  Journal of Genetic Psychology 75 (1949): 165-96.
Hefferline, R. F., & Louis J. J. Bruno. "Bio-Feedback: Controlling the Uncontrollable." In The Psychology
  of Private Events, Ralph Jacobs & Lewis B. Sachs, eds. New York: Academic Press, 1971.
Honzik, M. "Perspectives on the Longitudinal Studies." In The Course of Human Development, M. C.
  Jones, N. Bayley, J. Macfarlane, & M. Honzik. Waltham, MA: Xerox College Publishing, 1971.
Jersild, A. T., & F. B. Holmes. Children's Fears. Child Development Monograph No. 20, 358, 1935.
Jones, H. E. "Physical Maturing Among Girls as Related to Behavior." In The Course of Human
  Development, M. C. Jones, N. Bayley, J. Macfarlane, & M. Honzik. Waltham, MA: Xerox College
  Publishing, 1971.
Jones, M. C., & N. Bayley. "Physical Maturing Among Boys as Related to Behavior." Journal of
  Educational Psychology 41 (1950): 129-48.
Klineberg, O. "Racial Differences in Speed and Accuracy." Journal of Abnormal and Social
  Psychology 22 (1928): 273-77.
Macfarlane, J. "From Infancy to Adulthood." In The Course of Human Development, pp. 406-9.
Macfarlane, J. "Perspectives on Personality Consistency and Change from the Guidance Study." In The
  Course of Human Development, pp. 410-15.
Macfarlane, J. "The Impact of Early and Late Maturation in Boys and Girls: Illustrations from Life
  Records of Individuals." In The Course of Human Development, pp. 426-32.
Moriarty, A. Constancy and IQ Change. Springfield, IL: Thomas, 1966.
Murphy, L., & A. Moriarty. Vulnerability, Coping and Development. Yale University Press, in press for
  1975 publication.
Olson, W. C., & B. O. Hughes. "Growth of the Child as a Whole." In Child Behavior and Development,
  R. A. Parker et al., eds., pp. 199-208. New York: McGraw-Hill, 1943.
Rao, C. R. "Perspectives on the Conference." In Classification in Psychiatry and Psychopathology:
  Proceedings of a Conference in Washington, DC, November 1965. U.S. Dept. of HEW, P.H.S., p. 560.
  Chevy Chase, MD: National Institute of Mental Health.
Rosenzweig, M. R., Edward L. Bennett, & M. C. Diamond. "Brain Changes in Response to Experience."
  Scientific American 226, 2 (1972): 22-29.
Shirley, M. M. The First Two Years, 3 vols. Minneapolis: University of Minnesota Press, 1931.
Sontag, L. W. "Implications of Fetal Behavior and Environment for Adult Personalities." In Biology of
  Human Variation, E. M. Weyer, Editor-in-Chief. Annals of the New York Academy of Sciences 134
  (1966), Art. 2.
Sontag, L. W., C. T. Baker, & V. L. Nelson. Mental Growth and Personality Development: A
  Longitudinal Study. Monographs of the Society for Research in Child Development 23 (2, Whole
  No. 68), 1958.
Stolz, H. R., & L. M. Stolz. Somatic Development of Adolescent Boys. New York: Macmillan, 1951.
Williams, Roger. Biochemical Individuality. Austin: University of Texas Press, Reprint 1969 (original
  edition John Wiley & Sons, New York, 1956).



Part III

Some Examples of Meaningful Evaluation



The Prospect School: Taking Account of Process

Patricia F. Carini

In the past, many schools have recorded only achieved outcomes. Thus,
teachers and parents have typically accepted isolated end by-products of the
learning process as representing learners' knowledge. If a child could state an
answer to an arithmetic problem, the process by which he reached the solution —
whether by guessing, counting on the fingers, or engaging in logical derivation —
was neither discussed nor recorded.

To the extent that process was given consideration, as in the instance of asking a
child to correct a wrong answer by correcting the process he used to reach that
answer, process itself was construed to be a correct procedure. Therefore, as in the
assessment and recording of end products, process too has usually been measured
in terms of correctness rather than in terms of productivity, flexibility,
meaningfulness or spontaneity.

For children in classrooms where their activity is minimal except for verbal
responses, such assessment is virtually all that is available to a teacher. And, in
any event, end products are most readily available for measurement.

When classrooms are structured as environments that invite direct, active 
involvement in exploring concrete material, however, the processes underlying the 
child's organization of the world around him become more apparent to the



Reprinted from Childhood Education, April 1973, pp 350-5i




insightful teacher. With this awareness, the teacher may develop reluctance to
maintain the old forms of achievement-oriented record-keeping and testing.

A crucial difference in recording and documenting process rather than
achievement is that the former must be observed over time to determine a pattern,
a matrix of descriptions of the learner's involvement. Using measures of
end-achievements for assessment purposes, on the other hand, assumes that
learning can be recorded and assessed as isolated elements independently of the
meaning for the learner.

Speaking of the ongoingness of the learning situation, the importance of
continuity, John Dewey (1938, p. 38) said:

If an experience arouses curiosity, strengthens initiative, and sets up desires and
purposes that are sufficiently intense to carry a person over dead places in the
future, continuity works in a very different way. Every experience is a moving
force. Its value can be judged only on the ground of what it moves toward and into.
It is then the business of the educator to see in what direction an experience is
heading. . . . Failure to take the moving force of an experience into account so as to
judge and direct it on the ground of what it is moving into means disloyalty to the
principle of experience itself.

And Alfred North Whitehead (p. 25), who also grasped thoroughly the
organic nature of the complex we call "school," counseled that to assess the school
we should not test the achievement of its students but sample the program
according to its stated goals and philosophy:

Primarily it is the schools and not the scholars which should be inspected. Each
school should grant its own leaving certificates, based on its own curriculum. The
standards of these schools should be sampled and corrected. But the first requisite
for instituting educational reform is the school as a unit, with its approved
curriculum based on its own needs, and evolved by its own staff.

In any human process we only see, truly encounter, what is available to us from
our own point of view. We must first of all acknowledge the relativity of the event
— be it the learning process, knowledge itself, or the school — and the subjectivity
of our own assessment.

The "objective" measure, such as a score or computerized datum, always is
rooted in someone's point of view about what is worthy of validation. If all we can
articulate from our point of view about a child's experience with arithmetic is the
correctness of his information, then we can respond to and record only that aspect
of his experience.

If we perceive the most important skill in reading to be word-recognition, we will
find an "independent measure" of word-recognition and by applying it we will
validate not the reading process but the single dimension that (unacknowledgedly)
we deem to have greatest significance. But by being willing to forego certainty and
by accepting our point of view as part of the datum, we can accrue descriptive
patterns that will permit the stable characteristics, the stable themes of an event to
emerge.

The Prospect School, as a demonstration school for the state of Vermont, has
had to confront the necessity for record-keeping, documentation and evaluation
since it was founded in 1965. Beginning with an original group of twenty-five
five- and six-year-olds from widely varied economic backgrounds, the school has
evolved to include approximately ninety-five children, grades K-9. The population
continues to reflect diverse economic backgrounds. Staff presently includes five
teachers, and it is their records that provide the basic data for an ongoing research
and documentation program. The design for the record-keeping, the research and
the documentation, and the collation of the longitudinal data have been largely


the responsibility of a small research staff (originally one person and, since 1971,
three persons) serving in an adjunct relationship to the school.

What those of us working at The Prospect School have done within our program
is to construe our record-keeping as a consciously temporal and subjective process.
In practice, we consciously examine and record processes — e.g., social
development, expressiveness, reading — descriptively so that any given process is
available for interpretation over time according to the way it contributes to the
child's total development or to the evolution of the learning environment. That is,
the availability of descriptive records provides the basis for an ongoing
examination and interpretation from a variety of points of view of such diverse
processes as the physical and intellectual development of the individual child, the
patterns in learning to read among a group of children, the relationship of early
arithmetic skill to social development, the contribution of the individual's interests
to the evolution of the total curriculum, etc. While the primary objective of these
records is to contribute to the continuity of the individual child's learning
experience, their secondary objective is to provide a documentary account of the
evolving school program. Finally, from the data and insight accrued through the
records and documentation we have also designed instruments (Carini, Blake,
Carini, 1969) to make independent and longitudinal assessments of underlying
processes in children's problem-solving and thinking. Through this instrumentation
we hope to learn more of children's spontaneous formulations of their experience,
to better enable us to provide a learning environment that will support their
continuing growth.

The basic records (Carini and Carter, 1971; Carter and Carini, 1972) we have kept
to provide a continuing description of each child and to provide documentation
of program include the following:

— Children's work, e.g., drawings, photos, etc.
— Children's journals¹
— Children's notebooks or written work
— Teachers' weekly records
— Teachers' reports to parents
— Teachers' assessments of children's work in math, reading, activities
— Curriculum trees
— Sociograms



How these records contribute to the understanding of each child's involvement
in the learning process and to the documentation of the total program can be
gauged from the following excerpts from the records kept of the involvement of a
group of older children in an ongoing project (Carter & Carini, 1972):

Teacher's Records of Group Involvement in the Merck Forest Project [Excerpts]

September 24 — Today was beautiful. We took the group to the Merck Forest — Chris drove David
Sobel's van, beautiful weather and the kids seemed to enjoy it — only Heidi mentioned that she didn't
like the trips too much but after lunch she played her recorder with some of the other girls and seemed
happier.

Hugh (Putnam) introduced the forest and we broke up:
— Chris took Morris, Alec, Ned and Per up to Mount Antone
— Hugh led a group for the grouse expedition and wildlife — Elizabeth, Louise, Penny, Dru, Anna, Karl,
Jacob and me
— Charlie had a group learning about trees — Emily and Priscilla
— Only Heidi didn't choose an activity

After lunch — which everyone brought for themselves (except Morris and Jacob forgot) — David
read some of The Living Forest.



¹ This form of record-keeping applies generally only to children aged seven or older.


October 28 — I went to the Merck with seven students — Ned, Alec, Karl, Louise, Morris, Elizabeth,
Per. Chris also drove — it was a warm sunny day.
— Per and Ned hiked with Chris to the hunting lodge
— Louise, Morris and I helped blaze a trail
— Karl and Alec did surveying

After lunch Louise and I made tea from yellow birch and mint. We got back late but all went well.

November 12 — Went to Merck Forest — lots (four inches) of snow. Looked for and discussed
animal tracks in the snow. Hugh came (Chris went with us), Karl, Emily, Louise, Ned and Alec.

December 2 — Chris and I took a group of nine to the Merck Forest — three feet of snow with bright
sun. We hiked to the lodge on snow shoes and cross-country skis — all seemed to enjoy it despite
cold and exhaustion.
etc.

Teacher's Records on an Individual Child's Study of Mushrooms [Excerpts] 

September 20 — At the Merck Forest Elizabeth got really interested in mushrooms — wants to study
them next time out.

September 27 — (Elizabeth) — The mushroom project has really captured her interest. She has just
been up to her armpits in work since we came back. She and Anna and Dru made some really beautiful
posters explaining what they had found.

Elizabeth's Identification of Mushroom [Excerpts from Her Notebook] 

1. Growing in rich soil, 2½ inches high, sort of shady spot. Thick stem about 1 inch in diameter it is,
while gills are white getting down toward the edge. Also curves inward making cup shade, is about 1½
to 3 inches across the cap. The color gets darker as it cups in, feels hard, the stem is white. When
broken open, it is (or we think it is) a tricholoma. They grow next to pine trees mostly. This one was
growing with birches and maples. Animals may like eating this kind of mushroom.

2. 2¼ inches tall, the cap is 3 inches across. It is a brown color, is very roundy. The top flat part is
darker than the rest. The gills are white, the stem is pinkish white. UNIDENTIFIED.

Children's Observations on the Merck Forest [Excerpts from Journals] 

Karl — no date — There is a stream on a mountain. It's far. There is a place where I went that has
a pool deep and clear. There are woods all around.

Dru — October 7 — Went to Merck Forest, went to identify mushrooms.
Hike & Hike & Hike, finally at the lean-to.
Death caps & gasteromycetes are in store for us today.

Penny — December 2 — Most of us went to the Merck today. We had to go a different way today
because the other was not plowed. When we got there it was a very different place. That's what I had
thought, it was a big hill of white, and it had pine and spruce trees all over, and they had snow
weighting down their branches. The whole view was so beautiful. There were some things in Mary's car
that were Dave's that I wore. They were like boots in one way and slippers in the other and they were
very warm. It took us a long time to get up to the hunting lodge. When we got there we only stayed
long enough to eat lunch and then we had to go back to the parking lot. But we had a lot of fun. On the
way back I wore snowshoes.

Karl - March 21 - Last Wednesday, Thursday, and Friday we went to the Merck Forest on a three-night
overnight. I don't know what to say about it. I went out with Per and Jacob. They were throwing
snow-balls at me. Today I wrote a letter to the National Outdoor Leadership School about going on
one of their expeditions. If I got to go I would be able to hike and fish and all kinds of good things for
five weeks. And this afternoon I don't know what I will do but I hope I can keep busy reading or writing
or something. I also hope I can write in my journal tomorrow.

Implications

The final record reported can serve to identify a comprehensive curriculum
available to the total group at the Prospect School in math (surveying, maps,
shopping), natural science (mushroom research, grouse project, animal
identification, etc.), and physical education (trail construction, snow-shoeing,
skiing, etc.). Even from these brief excerpts, the pattern of interests and
involvements for individual children is provocative. Does Elizabeth's precision in
classification of the mushroom reveal an absorption in close observation, an
interest in mushrooms themselves, or perhaps both? Can this interest be continued,
given an inner connectedness, through close work with microscopes or more work
with identification? In fact, the interest did extend to microscopic study, and if one
were to look back upon many years of records of Elizabeth's involvements one
would find a balance struck between the precise and mathematical on the one
hand and the fantastic on the other, the latter expressed through reading
patterns, writing and productions in clay and papier-mache.

[Figure: Schematization of the Curriculum Evolved Through the Merck Forest Project (taken from
the Teacher's Records). Activities branching from "Merck Forest": compass work, map study,
surveying, map-making; hiking, wood games, skiing, snow-shoeing; photography; tree identification,
wildflowers, mushroom hunting, research on mushrooms, sample collection; grouse project; ecology
discussion; tracking; animal identification; winter camping.]

Thus, each project reported in the comprehensive record reflects the specific
interests and involvements of individuals like Elizabeth. That Karl, a child new to
the group, writes "There is a stream on a mountain. It's far. There is a place
where I went that has a pool deep and clear . . .", blazes trails, writes to the
National Outdoor Leadership School, and acknowledges his uncertainty as a member of the
group in his tangential relationship to Jacob and Fer gives depth and meaning to
Dewey's (1938, p. 38) statement that

Every experience is a moving force. It can be judged only on the ground of what
it moves toward and into. . . . It is then the business of the educator to see in what
direction an experience is heading. . . . Failure to direct it on the ground of what it is
moving into means disloyalty to the principle of experience itself.

In what ways should Karl be supported? What will extend and deepen his
interests without distorting them? One function of records is to provide insight to
teachers that enables them to promote continuity of the learner's experience.
Documentation in comprehensive form of that continuing interest and
involvement, in turn, constitutes a description of the program so that, in
Whitehead's words, "the standards of the school can be sampled and corrected."

Records like those above, together with brief daily recording of skills (e.g.,
reading and numbers), have permitted us to document our curriculum over the
past seven years, to report precisely to parents and others on growth of individual
children, to study patterns of relationship among children, and to discover patterns
of occurrences that reveal styles of learning; e.g., the bright, late-conserving boy
who is slow to learn the technical skills in reading.

From these close descriptive observations and recordings within the classroom,
we have also developed hypotheses and instrumentation for making longitudinal
assessments of processes in language, thinking and problem-solving (1967, 1968).
Our process of data-analysis over a four-year period, in combination with
observations of children carrying out spontaneous activities, has resulted in
analysis of the problem-solving tasks and resolutions of the tasks to form a scale
(Carini, 1972). This scale, at present only partially complete, is the single most
significant result of The Prospect School evaluation to date, as it has potential to
assess the following dimensions in the child's relationship to the world: a) what any
given task demands of the child; b) what the complexity and the availability of the
perceptual or conceptual material are to the child; and c) what level of
differentiation is reflected in the child's resolution of the tasks. This kind of
assessment, to provide a definition of the limits and plasticity of a developmental
stage, is needed if, as Wohlwill (1968) points out, we are to specify a developmental
timetable.

Equally, this kind of assessment has potentiality for replacing tests of correctness
with tasks that provide descriptive and diagnostic information about a given child's
approach to a problem. Thus, a child might be asked to give as many ways as he
can think of to make ten. By his formulation of the problem as limited, for
example, to one operation (addition) and that operation carried out randomly (8 +
2, 5 + 5, 3 + 7), we are informed of his capacity to resolve a task requiring the
formulation of a conceptual framework independently of perceptual materials.
Were we to reformulate the same task to provide the child with the operations and
the logical relationships (9 + __, 8 + __, 7 + __, 10 - __ = 9, 10 - __ = 2),
we might anticipate a higher level of resolution.
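
The contrast between a randomly ordered and a stepwise set of answers to the "make ten" task can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function names, and the rule used here to call a sequence "systematic" (first addends changing by exactly one in a constant direction), are hypothetical and are not part of the Prospect School scale.

```python
def addition_pairs(total):
    """List every pair of whole numbers (a, b) with a + b == total, in descending order of a."""
    return [(a, total - a) for a in range(total, -1, -1)]


def is_systematic(responses):
    """Hypothetical rule: responses count as systematic when each first
    addend differs from the previous one by exactly one, always in the
    same direction."""
    firsts = [a for a, _ in responses]
    steps = {later - earlier for earlier, later in zip(firsts, firsts[1:])}
    return steps == {1} or steps == {-1}


def missing_term(total, known):
    """Fill the blank in 'known + __ = total'."""
    return total - known


# The three answers quoted in the text, in the order given there.
child_responses = [(8, 2), (5, 5), (3, 7)]

# Every answer is correct, yet the order follows no stepwise pattern.
all_correct = all(a + b == 10 for a, b in child_responses)
ordered = is_systematic(child_responses)

# The reformulated task supplies the structure: 9 + __, 8 + __, 7 + __.
structured_blanks = [missing_term(10, known) for known in (9, 8, 7)]
```

A check of this kind scores neither version of the task as right or wrong; it only describes how the answers were organized, which is the descriptive information the passage is after.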

Thus both our records and testing can be addressed to process and description.
And if at times we also find reason to record or test specific knowledge
end-products, then this isolated information can be embedded into and given
appropriate weight within the total mosaic of the person's learning experience.
We are encouraged by our progress so far, but look forward to learning more
with and from the children of The Prospect School.







References

Carini, Patricia F. Outline of Research and Evaluation Design. North Bennington, Vt.: The Prospect
School, 1972 (mimeo).

Carini, Patricia F., Joan B. Blake, & Louis P. Carini. Progress Report III: A Methodology for Evaluating
Innovative Programs. North Bennington, Vt.: The Prospect School, 1969 (mimeo).

Carini, Patricia F., & Jane Carter, eds. Record Keeping. North Bennington, Vt.: The Prospect School,
1971 (mimeo).

Carter, Jane, & Patricia F. Carini, eds. Documentation of the Middle School. North Bennington, Vt.:
The Prospect School, 1972 (Xerox).

Dewey, John. Experience and Education. Kappa Delta Pi Lecture Series. West Lafayette, IN: Kappa
Delta Pi, 1938. Used by permission of Kappa Delta Pi, an Honor Society in Education.

Wohlwill, Joachim. "Piaget's System as a Source of Empirical Research." In Logical Thinking in
Children, I. Siegel & F. Hooper, eds. New York: Holt, Rinehart & Winston, 1968.

Whitehead, Alfred N. The Aims of Education. New York: Mentor, 1929.




Marcy Open School:
Feeding Back to Decision-Makers

Ruth Anne Aldrich

The basic issue behind any form of evaluation is accountability. Participants in
Marcy Open School of Minneapolis, Minnesota, have actively considered the
questions, "For what must a school be held accountable?" and "How can
evaluation provide us with the information we need to develop an increasingly
responsible program?"

Such questions, which have obvious importance for assessing the effectiveness
of any educational venture, have particular significance for those of us working to
develop informal, open education approaches. The Open School at Marcy is a part
of Southeast Alternatives, a federally funded five-year project (currently
administered under the National Institute of Education). This Experimental School Project
of the Minneapolis Public School District seeks to provide comprehensive change
in education by offering a number of alternative school programs to children,
parents and teachers. Marcy's open education program is one of the four elementary
alternatives from which parents and children can choose. It features flexible
curriculum, scheduling and age grouping for up to 330 children, ages five to
eleven, with emphasis on helping the children learn to think and to make
independent judgments.

Evaluation

Evaluation of the Southeast Alternatives Project is both external and internal. A
program of summative evaluation is being developed by a group known as the
Minneapolis Evaluation Team (MET), which reports directly to the National Institute
of Education. In this article we wish to focus on the program of formative
evaluation provided by an internal evaluation group.

For the past three years I have been working as the internal evaluator of Marcy
Open School. My role is to provide information to decision-makers that will help
them improve program. Decision-makers may be individual classroom teachers
who seek to identify and solve problems within their own classrooms, or they may
be the staff as a whole or the Marcy Advisory Council (composed of parents,
teachers and administrators), who are responsible for decisions concerning program
and structure. I work with the individual or the group involved to identify the
problem. As I gather relevant information it is given to the people involved to be
used as a tool toward making informed decisions.




Accountability

A major task the staff and Marcy Advisory Council have requested is the
evaluation of general program achievement of its goals and its accountability for
those goals.

To be accountable means to be held responsible for something over which one
has control. Schools have control over the goals they identify and over the
environment they construct to achieve those goals. Whether or not they do achieve their
goals with particular children is influenced by many factors that are not within the
control of the school. By the end of sixth grade, children have spent only
approximately 7 percent of their lives in school. During that time families, peer groups and
other societal agents have had a large influence upon a child's motivation for and
ability to learn. Ultimately, only the children themselves can be responsible for
what they learn.
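
The "approximately 7 percent" figure can be checked with rough arithmetic. The inputs below are assumptions, not figures from the article: seven school years (kindergarten through sixth grade), 180 school days a year, six hours a day, and a child about twelve years old at the end of sixth grade.

```python
# Assumed inputs (none of these are stated in the article).
SCHOOL_YEARS = 7               # kindergarten through sixth grade
SCHOOL_DAYS_PER_YEAR = 180
HOURS_PER_SCHOOL_DAY = 6
AGE_AT_END_OF_SIXTH_GRADE = 12

# Hours spent in school versus hours lived so far.
hours_in_school = SCHOOL_YEARS * SCHOOL_DAYS_PER_YEAR * HOURS_PER_SCHOOL_DAY
hours_lived = AGE_AT_END_OF_SIXTH_GRADE * 365 * 24

share_in_school = hours_in_school / hours_lived
print(f"{share_in_school:.1%} of the child's life so far was spent in school")
```

Under these assumptions the share comes out close to 7 percent; different assumptions about the school day or year would shift it by a point or so in either direction.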

Historically, the responsibility of schools has been misplaced so that they have
been held accountable for what children learn. Several unfortunate effects result,
most important of which is that schools become defensive about those things they
can't control and are actually relieved of the burden for those things they can
control: namely, the environments they create for children.* The Marcy
Advisory Council and the Marcy staff as a whole have accepted a position of
responsibility for the quality of the environment they create for children in the
school.

Given this definition of responsibility for environment, and my role as internal
evaluator, the general design for that evaluation is as follows: 1) selection by
school participants of priority goals, 2) assessment of the environment of the
school as it relates to those goals, 3) assessment of children's responses to that
environment, and 4) feedback of information to relevant decision-makers.
I will describe further the implementation of this design at Marcy:

Selection of Priority Goals

Throughout the first year of Marcy Open School's existence (1971-72), the staff
and parents identified and then later revised a list of seventeen goals for children.
The Marcy Advisory Council, staff and Evaluation Committee (a standing
committee of the Advisory Council) have chosen three of those goals as being of
highest priority for evaluation:

Goal 1: We want girls and boys to speak, listen, write, read and to deal with mathematical concepts
effectively and confidently.

Goal 2: We expect that children will take more responsibility for their own learning in all areas:
social, academic, physical.

Goal 3: We hope that children will increase their understanding of their individual rights and the
rights of others.

These three goals are accepted as being generally of greatest importance for the
school. The other fourteen goals have not been abandoned, but for the school year
1973-74 were not considered as a focus for general evaluation.

* Ruth Anne Aldrich, "Innovative Evaluation of Education," Theory into Practice 9 (Feb. 1974): 1-4.

Assessment of the School Environment

A school's creation of an environment includes the arrangements provided for
use of time and space, the materials and activities made available for children, and
the nature of the interactions that take place between adults and children. These
are all dimensions over which the school has direct control and that should be
consciously designed to facilitate children's growth in goal-areas. At Marcy Open
School my work as internal evaluator has been to collect information about each
of these dimensions through use of classroom observations, mapping, photography,
teacher questionnaires and children's interviews. I have sought information
about the environment as it relates to each of the three goal-areas stated earlier.

Time: How much time is given to formal instruction in goal-related activities: instruction in reading
skills, writing techniques, math skills, group discussion skills and techniques of responsibility? How
much time is available for the children to informally use the skills they are learning: reading for
enjoyment, being read to by adults or other children, writing about experiences, applying concepts of
individual rights in informal interactions with others, and discovering effects of being responsible for
projects?

Space: How much space is available for goal-related activities? How readily available is that space?
How undisturbed is that space?

Materials: What materials are available to the child: expressive materials, books, magazines, writing
equipment, listening equipment and recordkeeping materials? What range of ability and subject do
those materials reflect?

Activities: What activities are provided to encourage growth in goal-related areas: language
development, mathematical concepts of balance, design, calculations, understanding of the effects of
not following through on commitments, and increasing sensitivity toward self and others? How are
those activities chosen?

Interactions: What is the nature of interactions between adults and children? Is there an expectation
that children express themselves verbally, in writing and through artistic expression? Is an expectation
communicated to children that they take responsibility for their own actions? Are they allowed to fail
and to learn from that failure? Do adults express a respect for the rights of children and communicate
an expectation that children will respect the rights of others?

This list is not exhaustive, but all of its dimensions are clearly within the control
of the school, and school people should make conscious decisions about them.

Assessment of Children's Responses 

Though the school must not be held directly accountable for what a child learns,
because of other influences described earlier, a part of its accountability is in
knowing how children are responding to the environment it has created and in
modifying the environment if the children's responses are unsatisfactory. There is a
distinct difference between knowledge of what a child learns and knowledge of
how a child is responding. The question is not one of what a child can do, but
instead of what the child does. This difference is reflected in Marcy's goals, which
state the desire that children will read effectively and confidently, rather than that
they will know how to read, and the expectation that children will take responsibility
for learning, rather than that they will know how to take responsibility. The
evaluation, therefore, must look at the question of what children actually do
within the school environment.

To facilitate gathering this information, I have selected a sample of 20 percent of
the September 1973 enrollment at Marcy School from among children of each age
group, children of racial minorities and of the racial majority, and children
categorized as special education. Through classroom observation, children's
interviews, photography and collection of classroom and school records, the
following information has been made available for each of those children:



Goal-relatedness of activities during one day in October and one day in April
Participation in and products from various school interest centers
Samples of weekly or monthly classroom-activity records
Participation in special education and counseling programs
Growth in language and math skills over a two-year period
Growth in affective characteristics throughout each one-year period
Standardized test scores in reading and math
Excerpts from end-of-year reports to parents
Selected samples of art work and writing




This information has provided a profile of the involvements and growth in goal-areas
of the sample of children.
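
Drawing a 20 percent sample that still includes children from each age group, racial group and special education category can be sketched as a stratified draw. This is a minimal sketch under assumptions, not the procedure actually used at Marcy: the roster is invented, the field names (`age`, `minority`, `special_ed`) are hypothetical, and taking at least one child from every stratum is a choice made here for illustration.

```python
import random
from collections import defaultdict


def stratified_sample(children, stratum_of, fraction=0.20, seed=1973):
    """Draw roughly `fraction` of the children from every stratum,
    taking at least one child from each so no group is left out."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for child in children:
        strata[stratum_of(child)].append(child)

    sample = []
    for key in sorted(strata):
        members = strata[key]
        take = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, take))
    return sample


def stratum_of(child):
    """Hypothetical stratum: age, racial minority status, special education status."""
    return (child["age"], child["minority"], child["special_ed"])


# Invented roster of 280 children, ages five to eleven; not Marcy's enrollment.
roster = [
    {"id": i, "age": 5 + i % 7, "minority": i % 4 == 0, "special_ed": i % 10 == 0}
    for i in range(280)
]

sample = stratified_sample(roster, stratum_of)
```

Because every stratum contributes at least one child, very small groups are slightly over-represented and the overall sample can run somewhat above 20 percent. That trade-off seems reasonable when the aim is a profile of each group's experience and not a single schoolwide average.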

Feedback of Information to Decision-Makers

For evaluation to serve an ongoing formative function, we need to consider the
information as it is collected, rather than at some pre-specified endpoint in time.
Thus, feedback must be a constant process. All information that I collect at Marcy
is given to the teachers involved, as soon as feasible to do so. The form of
communication may be either written or verbal. Specific details are included,
identifying children, activities and times so that the information can be used in
appropriate and meaningful ways for planning.

In addition I make larger summary reports available to the total Marcy staff and
Marcy Advisory Council. A preliminary report is presented in midyear, and a report
summarizing all the information collected for the year is presented in May. In each
case I generalize information so that individual classrooms and children are not
identifiable to the reader, but I include sufficient detail so that decisions can be
made on the basis of the data.

Having received the information, the decision-makers themselves (be they
individual teachers, schoolwide committees, Advisory Council members or
parents) have responsibility to judge the success of the school in providing
adequately for children's growth. They are also responsible for making decisions
about possible modifications of program and how to best achieve them. Such
decisions might involve rearrangements of space, changed grouping of children, or
sharpening of teachers' skills through staff development.

Conclusion

This model of school evaluation has been implemented in an open school. The
implications of such a process are not limited, however, to open or informal
education. Schools should be accountable for what they provide for children, no
matter what the structure of the program might be. Regardless of the setting,
evaluation can serve the function of reflecting information about that
environment.

Schools should be growing, evolving institutions aware of their successes and
designing change for their failures. Through a realistic definition of accountability
and an active program of evaluation that process can become a reality for all
schools.

Children's Interviews

Nancy Ann Miller

Over the past several years the staff at the University of North Dakota's Center
for Teaching and Learning (formerly the New School for Behavioral Studies in
Education) has been working to develop new forms of evaluation. As sponsors of a
Follow Through approach, we have directed much of our evaluation effort toward
developing interview instruments, with particular emphasis on community
participation.



This article will attempt to provide a general description of a children's interview,
"And What Do You Think?," and to describe ways it can be used as a
feedback tool for persons attempting to understand the classroom interactions of
teachers and children.

Development of the Interview

Work on the interview¹ first began in Chicago, 1970-71,² by a research staff
having input into the process of "opening" four inner-city classrooms in two large
Chicago elementary schools. The staff soon realized that much of our discussion of
where children were or what would be good for them was based on our
interpretation of their classroom actions with little testing of our speculations
against what the children themselves perceived about who they were and what
they needed.

We began evolving the questions by informally talking with children right in the
classroom about what they were doing, wanted to do but couldn't, and so on. A
useful reference in the process was the approach used by Piaget³ when talking with
children about their conceptions of natural phenomena, particularly his emphasis
on so formulating a question that it allows children to respond from their own
experiential and conceptual frameworks.

Our first interviews were conducted by three staff members who had spent many
hours in the classrooms as participant observers over a period of six to seven
months prior to the interviewing. Through their varied exchanges with the children,
the questions and sequence were revised many times and, as a matter of fact,
continue to be revised up to this time.

The present interview consists of approximately twenty open-ended questions
about the child's activities and involvement in school, teacher-child relationships,
peer interaction, and the child's view of the classroom as an overall learning
environment.

Children are usually interviewed outside the classroom, one at a time, to avoid
disruption. Interviews are taped initially and then later transcribed so that they can
be analyzed more easily. Length varies from thirty to sixty minutes. We have found
twelve children (from a class of twenty-five to thirty children) to be a manageable
classroom-sample size: small enough to allow completion of the interviews and
large enough to provide a good picture of the interaction patterns and activities in
the classroom as seen from the children's perspective. Confidentiality of the
interviews needs to be strongly emphasized. Throughout the interviewing process,
care is taken to avoid identification of individual children and parents.

The effects of such variables as the setting of the interview and age, sex, race and
familiarity of the interviewer must be considered. However, our findings tend to
show that most important to the quality of the interview is the ability of the
interviewer to structure relevant questions and to listen intently and nonjudgmentally
to the child. These skills, which require training and practice, are also those
necessary for teachers in their interactions with children, and are too often
overlooked in teacher-preparation programs.

Uses of the Interview

The present children's interview has been used extensively at the Center for
Teaching and Learning in both Follow Through classrooms and a wide range of
other classrooms (grades 2-7) across North Dakota. It has served primarily to
provide useful information to teaching staff, most productively when its results
are given in a summarized feedback, along with results of teacher- and
parent-interviews, in a team setting where discussion and clarification are possible.⁴

An added benefit beyond the feedback to teachers is the opportunity for those in
teacher preparation to take part in the interviewing. An often reported result of this
experience is an increased sensitizing to children's experiences in the classroom.




Interviewers and teachers are often surprised at the depth and range of the
children's perceptions. To increase this kind of two-way understanding and
exchange between teacher and child is the major objective of the interviewing
process.

It is interesting that some children in the interviews have expressed a desire to
know more about what the teacher thinks and feels. For example, one third-grade
child, whose teacher was under pressure to reverse the label of having the noisiest
classroom in the school, responded to the question, "Tell me something you would
like to know more about," with "How he [teacher] feels."

One area of the interview deals with how the child perceives the teacher in
terms of what he/she does, likes to do or the kind of exchanges the child has with
the teacher. The following responses are from two children in two different
classrooms to the question, "Tell me what your teacher does in your classroom."

She goes around and helps people, like if they have a problem and can't figure out their math, she
comes over and she'll get [illegible] a cube box and show them how to divide a fraction. And when
she plans with us writes on the board when we plan. And, like when we have discussions she usually
leads them and tells us some good stuff. When we have projects, she comes up to us and brings us a
book and shows us the right pages to look at and stuff like that.

He works [unintelligible] and he checks papers and people who are messing around, he tells
them to quit and that and if they're throwing stuff, he hollers at them so they quit throwing it.
Sometimes when they do stuff they're not supposed to they have to sit down in their desk and that. He
checks quite a bit of papers and he checks the math books and we have like, every week we have
like, I'm in a B book and somebody's in an A book, they have sheets that you have to work with, and
like one day he's got a B thing that goes up to the chalkboard, next day he's got a C, and so on like that.

When these same children were asked what their teachers like to do best and
also if and when they talked to their teachers, their responses were:

First child: Well, not to go over and like tell people to get to work all of the time. She likes to help
the different people that need help and that way she can go around and help everybody and give them
ideas for planning and show them how to do a fraction a lot easier and, but a lot of times a lot of kids
get noisy and she has to go over and tell them and she doesn't like that very well because then she
can't help other people and show them what to do and easier ways and stuff.

Second child: I don't know . . . You don't talk to him much, just if you have to get some work or
something. Sometimes you go up and talk to him about how you're working in your books and he tells
you to come up and talk to him about what page you're going to work and all that.

Asked if the teacher talked to them about what they were doing, the first child
responded:

Yes, she talks to you. If you're working on your SCS, or like building, she comes over to you, and says,
"Well, do you need a book on it, or do you need materials on it?" Some guys are making an ear harp,
and she asked them, "Do you need some screws?" and the person says, "Yes," and she asked some
person to go uptown and get some screws for him, some other kids.

To the same question, the second child responded:

He just calls us up anytime and he asks us what you're going to do today and that stuff and I don't
know, once a week or so he does it.

These radically different pictures of a teacher's role do not necessarily indicate
that all the children in these two rooms perceive their teachers in the same way.
Patterns often emerge, however, and teachers may discover that their own
perception of their role in the classroom differs from that the children have.

Some of the exchanges are very warming. The following dialogue leaves out the
thoughtful pauses of the third-grade girl when responding to the question, "When
is your teacher the happiest?"

[She] likes us to do our work, when we're good.

"How can you tell when your teacher's happy?"

I look at her eyes and I can tell that she's mad or happy.

"How can you tell by the eyes?"

When they're happy their eyes look happy like they have a smile on their eyes. When mad they look
like a ball of fire; sometimes they get red.

"They really get red?"

Uh-huh.

"Could I learn this, how to tell how people feel by looking in their eyes?"

Well, my mother taught me [laughs], she could tell sometimes if I was lying by looking in my eyes.
But you could learn it too.

By looking straight in their eyes like you're going to hypnotize them.

One can also gain information about how the children perceive sex roles. When
asked, "Are there some things only boys can do in your classroom?," children gave
the following responses:

They [boys] know more than you do but I know more math than they do.

Only boys [illegible]. "Why?" Because girls get hurt easier than boys can.

[Illegible]. "Why?" Afraid we [girls] might cut ourselves or something.

Fewer responses are given to the question of what things only girls can do, but
one such response was:

The girls can knit, they can sew by hand. They [boys] usually have to be told to do their work and
the girls don't.

Although many factors influence children's perceptions of sex roles, teachers
should be aware of the role their own actions play in perpetuating stereotypes that
limit a child's growth.

Another important aspect of classroom living is the nature of the interaction
among the children and whether they view each other as resources and helpmates.
Most children prefer to work with others and, when asked how doing so helps
them, typical responses are:

Like if you're having difficulties maybe they would know, help me understand better.

Well, like if the other kid doesn't know something you can tell them and explain it and if you don't
know something they can sort of explain it to you. You can learn more things that way. You can help,
you can help him learn stuff he didn't know and he can help you learn stuff you didn't know.

Some children will say that whether they like to work with others depends on the
specific activity involved. One child, however, preferred working alone because,
"Then I can talk to myself out loud."

Given the obvious relationship between the information one gets and the form
and manner of asking a question, the interview can be useful too in suggesting
questions teachers should be asking about their classrooms. The problem of
structuring the question so it does not suggest a response or limit children in
responding from their own experience is indeed hard to appreciate unless one has
consciously worked at it. Here the process of the interview could be useful to both
teachers and children: simply reflecting about information needed and finding
questions to help elicit it. Children can and should be more involved in this process
of asking questions about their environment. In collecting information, it is easy to
get into the pattern of knowing the answer before one asks the question.

We have also found in the interviewing that some children take much
convincing that we really want to know what they think. Generally, the children,
though sometimes shy or nervous, will respond quite sincerely to the questions if
the interviewer is genuinely interested. Otherwise, some children will play the
game of giving us what we want to hear.5

Core purposes of the children's interview are to encourage children to say what
they think and to stimulate those who listen to be affected by what is said.
Unfortunately, many classrooms do not have time for such dialogue, as there is too
much "work" to be done. Much of what happens in schools is based on "what
should be done" as determined by someone other than the person doing it. Asking
children what they think and responding to their answers will not only help them
clarify their thoughts, feelings and needs but extend and strengthen them. In
addition to practicing expression of their thoughts, the children are also
themselves learning to ask questions and hopefully to influence change.

In many classrooms children are not encouraged to question or to express their
confusions. Eliciting responses can be difficult; interest, acceptance and guidance
are crucial ingredients of a supportive environment. The following responses are
from children asked what they would like to know more about or projects they are
involved in:

"Tell me something you would like to know more about?"

I would like to know everything in the whole world. Do movie stars have to know everything there is
to know?

"Do you think you could know everything?"

Well, you would have to have a big encyclopedia with everything there is to know, but it would be
so big that you would need a ladder to climb to turn the pages and it would take over 100 people to lift
it.

"I saw a lot of projects going on in your classroom. Can you tell me what a project is?"
[Another child] Well, like if you're working on the stock market you write reports on it and you read
about it and want to know more about it and then like you have like a thousand and you give that to
everybody and you have them buy stocks and you show them then, like you can be a stockbroker
and show the other guy how to do it and plan and make a graph if the stocks go up or down and write
reports on it, how it's working out. And, you can show the other kids how to do it and then make
stocks and it really goes nice for a while.

We need to learn more about space and how it affects us. A typical response to
the question, "Do you have a favorite place in the classroom?," is:

Yeah, one of the carpeted corners in the room. Then it's really fun if you go back and there's not
much noise there and it's a real nice cozy place that you can work in and there's music back there and
you can work pretty fast. It's really fun.

While many children are bothered by noise and interruption, the answer is not
simply to restrict talking and movement but to consider spatial arrangements that
provide some isolation and facilitate quiet movement.

Another point our interview experience has reinforced is that some children
need a surprisingly long time to respond to a question and do not respond well to
pushing. One must work at "listening" and responding nonjudgmentally. Children
often have a good understanding of their needs but have difficulty knowing how to
find help in the classroom. For example, one third-grade girl said early in the interview
that she wanted to learn more about math. Later, she replied to the question,
"What would you like for the teacher to stop doing?," with "Giving us so much
math." When asked to clarify this discrepancy, she responded that the teacher gave
them a lot of math but never really showed them how to do it. This perception may
or may not have been shared by the other children in the class, but it is important
to know about, especially since we can continue along a direction for a long time
before knowing how ineffective we are.

Teachers can use the questions from the interview informally in the classroom as
a way of providing for continual feedback in conferencing and small-group
discussions. For some children the one-to-one exchange is the most comfortable
setting, but they also need to develop the ability to express themselves in a group
setting. Questions can also be built into inquiry sequences. For example, the
teacher asks for ideas for changing the room and then, in a nonjudgmental fashion,
receives suggestions, guides toward a consensus upon one suggestion (or the
number feasible), and has a group actually plan and carry out the change. This
sequence involves getting feedback about the classroom plus helping the children
become more articulate and able to plan and act. It is only one simple example of
stimulating exchange between teacher and children and of involving children in
classroom planning and change.

The interviewing process itself might be a profitable learning experience for
teachers who go to another classroom and do some interviewing. Again, the
problem of confidentiality has to be very carefully considered. Certainly, visiting
other classrooms and informally talking with children about what they are doing,
their likes, dislikes, etc., can be a fascinating learning experience. If interviewing of
a representative sample of children in a classroom were to be done, persons not
directly involved in the classroom might more productively do the interviewing.
This kind of interchange too is a sensitive matter and should be approached so that
it is a positive feedback process for the teacher. The interviewing should be done in
the framework of a supportive staff development process.

Presently members of our New School staff are attempting to relate the
children's interviews to teachers' interviews, scaled on classroom dimensions
described by Patton.6 Hopefully this would help us better understand the relationship
between the classroom structure and happenings as perceived by the teacher
as well as by the child. A study is also in progress of the variance of children's
responses in different classroom settings and the relationship between the
responses and the setting.

Evaluation and accountability, though integral to learning and teaching, have
become multiple-headed monsters in education. Often more money and energy
are spent on judging success and failure and on producing packaged success than
on supporting teachers and children in the difficult task of learning. In this
atmosphere, neither children nor teachers can admit their weaknesses, for someone
is always very willing and ready to hand them judgmental criticism.

If the process of evaluation is to be a positive stimulus in learning, it cannot be a
continually one-directional process with one side always setting the goals and the
process for attaining them and having the power to determine success or failure.

The kind of exchange the children's interview hopes to stimulate calls for
increasing the active involvement of the children in their own learning and helping
the teacher to better understand their experiences. The children's interview would
be sadly misused if it were ever to evolve into a "standardized" instrument used for
accountability purposes. As an evaluation tool, however, it can be used positively
to help teachers and children "take a reading" of where they are in the process of
learning and teaching.



Footnotes

1 Most recent copy of interview is available from author.

2 Work began on the interview in Chicago under a Ford Foundation grant directed by Daniel Scheinfeld.

3 J. Piaget, A Child's Conception of the World (Totowa, NJ: Littlefield Adams, 1967).

4 3d Quarter Report, Teacher, Child, Parent Interviews, submitted to National Institute of Education, Center for Teaching
and Learning, University of North Dakota, July 1974.

5 The role of questions in the classroom is further emphasized by Francis Hunkins in his book, Questioning Strategies and
Techniques (Boston: Allyn & Bacon, 1972), in which he discusses both the importance and strategies of questioning in the
classroom. Too often classroom questions are based on a hidden agenda of right and wrong answers and are one-directional
or teacher-to-child.

6 Patton, "Structure and Diffusion of Open Education: A Theoretical Perspective and an Empirical Assessment,"
unpublished doctoral dissertation, University of Wisconsin, 1973.



Reflection in Teaching

Anne M. Bussis and Edward A. Chittenden



In what ways do teachers think about teaching? How do they conceive of the
complex pattern of events that mark the school day? What assumptions do they
hold about learning and development? What are the grounds for their planning,
provisioning and evaluation? These and similar questions about teachers' beliefs
and understandings become increasingly important with change in the direction of
more complex classrooms and greater teacher responsibility for curricular
decisions.

During the past few years we have been interviewing teachers in order to study
some of these questions. Although in-depth interview methodology is not common
in educational research, it fits well with a phenomenological view of man. The
phenomenological tradition in psychology historically has emphasized attitudes,
beliefs, understandings, values and perceptions as major determinants of human
behavior. Applied to education, this view places greater emphasis on the importance
of a teacher's internal perspective (thinking/valuing) than on the importance
of a particular method or strategy in determining what happens in the classroom.
Depending on the particular theorist one reads, this internal perspective has been
referred to as "life space," "assumptive world," "belief system," "reference
system," and the like. George Kelly's (1955) notion of "personal construct system"
seems particularly appropriate, however, because it so clearly suggests an image of
man as activist, an image that is central to all phenomenological theory (see the
methodology article, pp. 7-12).

Teaching-Learning Constructs 

A personal construct means what the phrase implies: a personal construction
or representation of some aspect of reality that is the result of an individual's
interpretation of his world. A construct may be likened in some respects to a
concept; it refers to objects or events that a person categorizes in his mind as
somehow similar in meaning. It is unlike a concept in that its boundaries (the
range of experience to which it applies) are personally defined on the basis of
each individual's past history. But constructs are not merely ways of interpreting
and labeling what has happened. They are also ways of predicting and anticipating
events, as forerunners of action. For example, the teacher who construes
block-building primarily as large muscle exercise will make different predictions
about this activity and undoubtedly act in different ways from one who construes
it as the child's concrete representation of thought.

The authors of this article have been involved in several projects related to questions of the role of teachers in educational
change. These projects have involved interviewing, visiting classrooms and participating in workshops. The comments
included herein are based, to a considerable extent, on the experience and findings of their principal interview study. The
full report of this study, funded by the Ford Foundation, is now being written.

Reprinted with minor modifications from Notes from Workshop Center for Open Education, Spring 1974, pp. 2-7. Copyright
1974 by the Workshop Center for Open Education, Shepard Hall, City College, Convent Ave. & 140th St., New York, NY
10011. Reprinted by permission.

To the extent that a person is open to feedback about the consequences of his
action, predictions via constructs will sometimes prove correct and sometimes be
found wanting. Thus, the revision of constructs is seen as a function of a person's
willingness to act on his own best judgment and his openness to feedback from the
environment. Simply "having a new idea or feeling," while important in its own
right, is relatively inconsequential for affecting behavioral change. Translating an
idea into action and experiencing its consequences count for much more and
constitute the basis of personal (as opposed to "academic") knowledge and
learning. This last assumption points up the obvious importance of experience in
shaping personal constructs and suggests that, if significant progress in teaching is
to occur, teachers need a quality of experience supportive of personal exploration,
experimentation and reflection.

What we have been interested in studying, then, are the personal constructs of
teachers regarding the teaching/learning process, as well as their perceptions of
major supportive and inhibiting influences on their professional development. Our
interviews were semi-structured and as informal as possible, encouraging the
teacher to stress and repeat whatever priorities and concerns were uppermost on
his or her mind. The questions were open-ended and designed to elicit judgments,
opinions and reflection. During the first part of the interview, questions relating to
the teaching process were discussed in some depth: such topics as room
arrangement and the value of different materials, the organization of the day, the
nature of instructional planning, the role of children's interests and emotions in
learning, how to evaluate children's learning, and so on. The second portion of the
interview centered on the teacher's perception of supportive and nonsupportive
influences on his or her professional development, including the role of advisers,
other teachers, administrators, paraprofessionals, parents, workshops, course work
and school policies.

Constructing Surface Content 

One of the most interesting problems of the study has been to interpret the
notion of "curriculum" in a psychological rather than logical way, in order to reflect
the broad range of teacher understandings and meanings. The important questions
from a phenomenological view are: How does the teacher conceive the
curriculum? What is the teacher's personal "curriculum construct"? In attempting
to deal with these questions, we have distinguished two levels of curriculum. At
one level, curriculum refers to the variety of activities the teacher plans for and
encourages as well as those he/she may merely permit or tolerate. Because this is
what an observer would see going on in the classroom, we have thought of this as
the surface content of curriculum.



Organizing Content 

At a deeper level, curriculum has an organizing content which consists of the
learning priorities and concerns a teacher holds for children. To oversimplify
matters, what does the teacher want children in his or her classroom to know, do,
feel, think or care about? What qualities of learning are valued, and which is the
teacher trying to promote? As it turned out, these priorities and concerns were not
too difficult to identify from the recurrent themes that permeated the interview, and
they ranged from quite comprehensive ones to relatively narrow and conventional ones.
For example, a concern with children knowing "what they are about, and why"
(i.e., a concern with the qualities of intention and reflection) was considered a
comprehensive priority, whereas emphasis on children demonstrating basic skills
and facts expected at a particular grade level was considered relatively narrow. We
should point out that "comprehensiveness" in this respect refers not only to the
extent to which a priority engages the totality of children's cognitive/emotional
resources, but also to the subsuming power of the priority. Thus, a concern for
intention and reflection generally subsumes a concern for children's acquiring
essential facts and skills, although these are not viewed as tied to a specific
grade level.

Making Connections

Having distinguished between activities in the classroom (surface content) on
the one hand, and learning priorities (organizing content) on the other, another set
of questions deals with the connections and interconnections between them. First,
does the teacher perceive many, few or any connections between his or her
priorities and what is going on? For what purposes, in the teacher's mind, are
children building a block castle, or looking at leaves through a magnifying glass, or
making books filled with their own stories? Second, does the teacher conceive of a
particular set of activities as serving only one priority, with a separate set serving
another, and so on? Or, are activities viewed from many perspectives and seen as
potentially valuable for a number of learning priorities? The nature of these
connections and interconnections theoretically becomes a critical factor in the
degree of psychological organization or structure that pervades a classroom.

Inferring Priorities 

It should be pointed out that teachers were never asked directly about curricular
priorities. Rather, these were inferred from the substance of comments made in
response to the many questions throughout the interview. In all, seventeen
priorities were identified, eleven of these having a cognitive emphasis and six
having more of a personal/social emphasis. Not only did teachers vary
considerably in the number and nature of priorities for which they were coded, but
also in the degree to which they seemed consciously aware of having priorities at
all. One particularly interesting finding is the way in which the curriculum
construct appears to relate to a teacher's feeling of confidence. Greatest
uncertainty was expressed by those who planned for a wide variety of activities but
whose priorities tended to be dominated by basic skills and good behavior
concerns. These teachers were experimenting with surface curricular changes, but
had difficulty seeing connections between many of these activities and their major
concerns. While they believed in an abstract way that worthwhile learning should
be going on during these activities, they were struggling to understand it and were
frequently worried about it. In contrast, teachers who planned a wide variety of
activities and who held comprehensive curricular priorities could more often see
connections between what was going on and what they were trying to promote,
as could teachers holding relatively narrow priorities and not engaging in much
surface curriculum experimentation.

Examining Intuition 

Obviously, we cannot do justice to the observation about teachers' curricular
thinking and feelings of confidence or uncertainty in the short space of this article.
This is not the intent. Rather, our purpose is to point up some questions one can
ask about teachers' thinking and to raise a basic issue: How important is it for
teachers to be able to analyze, reflect upon and articulate their basic assumptions
about teaching? Although there are differences of opinion on this matter, it seems
to us that analysis and articulation of the teaching/learning environment are important
in at least two respects. First, analysis and articulation are critical components
of the teacher's ability to communicate to others: to administrators, to
parents, to other teachers (and, in a much more subtle and complex way, to
children). This certainly is the most commonly mentioned and widely debated
sense in which analytic articulation can be seen as important. But second, and less
commonly discussed, analysis would seem important for the teacher's evaluation
of his/her own efforts, especially when things start to go poorly or to stagnate.
What conscious frame of reference can the teacher bring to bear in an attempt to
analyze what is happening? Can he/she look at the relationship of curricular
concerns to surface content and begin to sort out priorities? When called for, can
teachers examine the woods and not the trees? We are not advocating that a teacher
should be able to formulate a rationale or purpose for everything he/she does in
the classroom, and we certainly are not denying that the immediacy and
complexity of teaching demand heavy reliance on common sense and intuition.
The point is, can the intuition later be examined in a reflective way?

Toward Ongoing Professional Development 

This issue has a direct bearing on one's view of professional development.
Perhaps the most prevalent notion of teacher development is one that implies that
the engagement of a teacher's critical and conceptual faculties will be most
intense during preservice training and the initial two or three years of experience,
and that after a certain level of mastery and efficiency has been acquired, inservice
education is more a matter of maintenance and retreading. In a recent study
(Zahorik, 1973) in which teachers were asked to throw off the constraints of their
actual teaching situation and to imagine an ideal teaching situation, findings
suggested that the options many teachers actually perceive in teaching are options
between two or three accepted methods to achieve a given goal. As advocated by
open education, however, when the options broaden to include not only
nontraditional activities and methods, but the very goals themselves, then curricular
decision-making becomes considerably more complex. Commensurate with this
view of teaching is a conception of professional development as ongoing, with
the goal being to sustain the critical reflection and conceptual growth of teachers.

REFERENCES

Kelly, George. The Psychology of Personal Constructs, Vol. 1. New York: W. W. Norton, 1955.
Zahorik, J. "What Good Teaching Is." Journal of Educational Research 66 (1973): 415-40.



Selected Bibliography

Brenda S. Engel

On Evaluation in General

Combs, Arthur W. Educational Accountability: Beyond Behavioral Objectives. Washington,
DC: Association for Supervision and Curriculum Development, 1972. A critique of
behavioral objectives and learning theory as inadequate bases for evaluation. Importance
of holistic view of education; "personal meaning" of curriculum as key to student
behavior; implications for teacher accountability.

Duckworth, Eleanor. "The Bat-Poet Knows: Evaluation in Informal Education." Music
Educators Journal, Apr. 1974: 70-72. A brief article emphasizing the need for adults to try to
understand children's work; suggestions for questions a teacher might ask himself about
a child and music. Randall Jarrell's poem, "The Bat Man Knows," is offered as a parable.

Evaluation Reconsidered. New York: Workshop Center for Open Education, 1973 (an
occasional paper). Various articles including "Toward the Finer Specificity," by Lillian
Weber; "The Horizontal Dimensions of Learning," by Anne Bussis & Edward Chittenden;
"Toward a Shared Appraisal," by Charity James; "Documentation, an Alternative
Approach to Accountability," by Patricia Carini; "Evaluating African Science: Case in
Point," by Eleanor Duckworth; "Report from North Dakota," by Vito Perrone.

Hickey, M. E. "Evaluation in Alternative Education." NASSP Bulletin, Sept. 1973: 103-109.
Some reasons for the difficulties common to evaluating alternative programs; suggestions
for implementing broader, more comprehensive kinds of assessment.

Notes from Workshop Center for Open Education. New York: Workshop Center for Open
Education, Dec. 1972. Seven articles including: "On Accountability," by Lillian Weber &
Celia Houghton; "Record Keeping," by Bonnie Brownstein; "A Teacher's Log," by Janet
Arndorfer; "Parent-Teacher Conferences," by Ann Hazlewood; "Parent Interviews," by
Michael Patton.

Scriven, Michael. The Methodology of Evaluation. Bloomington: Indiana University,
Social Science Consortium, 1966. A distinction made between "goals" (a final judgment
of worth) and "roles" (uses along the way) of evaluation, in reference here to curricular
materials; detailed analysis of pros and cons of formative and summative models.

Shapiro, Edna. "Educational Evaluation: Rethinking the Criteria of Competence." School
Review, August 1973: 523-48. A critique of generalizable program evaluations and an
argument for research designed for particular situations; also for evaluation in the
service of program improvement rather than as justification.

Stake, Robert E. "The Countenance of Educational Evaluation." Teachers College Record,
Apr. 1967: 523-40. A clarification of some of the issues involved in educational
evaluation: the nature of the evaluation (descriptive and/or judgmental), what is to be
scrutinized (antecedents, transaction, outcomes), bases for judgment (comparison of
programs or absolute standards) and purposes or uses.

This bibliography is by no means a general listing of references in the field of evaluation and testing. The references cited
have been narrowly selected for their particular relevance to the contents of this publication.



Zimiles, Herbert. "An Analysis of Current Issues in the Evaluation of Educational Programs."
Disadvantaged Child, Vol. II: Headstart and Early Intervention, 1968: 547-54. Operational
evaluation (as opposed to, or preliminary to, outcome evaluation) suggested for
innovative programs as more goals-related and process-oriented.

_____. "A Radical and Regressive Solution to the Problem of Evaluation." New York:
Bank St. College, 1973. Diagnosis of methodological weaknesses in recent early
childhood evaluation designs; recommendation to shift emphasis from impact studies to
assessment of educational environments.

On Open Education 

Bussis, Anne M., & Edward A. Chittenden. Analysis of an Approach to Open Education.
Princeton, NJ: Educational Testing Service, 1970. A preliminary, basic statement by a
research group in response to need for methods of assessing programs in open education;
conceptual framework clarified and implications drawn for evaluation and research.

Walberg, Herbert J., & Susan Christie Thomas. Characteristics of Open Education. Newton,
MA: TDR Associates, 1971. An examination of open education concepts taking off from
the conceptual framework presented by Chittenden and Bussis in Analysis of an
Approach to Open Education (see above listing). Characteristics identified through analysis
of literature on the subject and questionnaires. Includes several teacher/classroom
assessment instruments.

On Testing 

deRivera, Margaret. "Academic Achievement Tests and the Survival of Open Education."
Newton, MA: Education Development Center, 1972. An analysis of exactly how
standardized tests serve to undermine and threaten the future of progressive school programs,
based on experience with the Philadelphia Follow Through.

Karier, Clarence J. "Testing for Order and Control in the Corporate Liberal State."
Educational Theory 22, Spring 1972: 108-136. An historical account of cooperation between
large private foundations and the standardized testing establishment in twentieth-century
United States in maintaining the "meritocracy" and controlling the socioeconomic
structure.

McClelland, David C. "Testing for Competence Rather Than for Intelligence." American
Psychologist, Jan. 1973: 1-14. A critique of current test practices and inferences drawn
from them that perpetuate a myth of meritocracy; suggestions for alternative kinds of
tests.

McGarvey, Jack. "Standardized Tests: 5 Steps to Change." Learning, Apr. 1974: 24-26. An
optimistic view of general disenchantment with standardized tests in pockets across the
country, along with an outline of five steps to be taken to assist change.

Meier, Deborah. Reading Failure and the Tests. New York: Workshop Center for Open
Education, 1973 (an occasional paper). An indictment of standardized, normative group
reading tests, based on seven cultural biases which are identified and examined; an
alternative suggested of longitudinal data-collecting.

Meier, Deborah, Ann Cook, & Herb Mack. Reading Tests: Do They Help or Hurt Your Child?
New York: Community Resources Institute and Workshop Center for Open Education.
Examples from actual reading tests selected to demonstrate some of the confused
thinking, cultural bias and ambiguous illustrations common to such tests. Comments by
the authors.

Piaget, Jean. "The Right to Education in the Modern World." Freedom and Culture
(UNESCO). New York: Columbia University Press, 1951. Pp. 69-116. A strong indictment
(pp. 84-86) of academic examinations within the context of a general statement on
educational rights.

Silberman, Charles. Crisis in the Classroom. New York: Random House, 1970. Mention of
testing in this standard work includes a brief discussion of attitudes toward testing in
England and testing as a rating method in the U.S.A.



Other Methods of Evaluation 

Aldrich, Ruth Anne. "Innovative Evaluation of Education." Theory Into Practice, Vol. XIII,
Feb. 1974: 1-4. School accountability defined as responsibility for instructional
environment rather than for learning outcomes.

Carini, Patricia F. "Documentation, an Alternative Approach to Program Accountability."
North Bennington, VT: Prospect School. An argument for self-reflective, process-oriented
documentation based primarily on biographical/historical method. Samples of
record-keeping from the Prospect School included.

Cohen, Dorothy H., & Virginia Stern. Observing and Recording the Behavior of Young
Children. New York: Teachers College Press, 1972. What to look at and how to record
anecdotally; directed toward nursery and kindergarten observations.

Duckworth, Eleanor R. A Comparison Study for Evaluating Primary School Science in Africa.
Newton, MA: Education Development Center for African Primary Science Program.
Description of the problems, procedures and results of the actual evaluation.

"A Field and Planning Study for the Appraisal of Reading in Open Classrooms." Princeton,
NJ: Educational Testing Service, 1973. An analysis of the inadequacy of conventional reading tests
and plans for the development of more useful and appropriate means of assessment. By
the Early Education group at ETS.

Hawes, Gene R. "Managing Open Education: Testing, Evaluation and Accountability."
Nation's Schools, June 1974: 33-47. A description, with examples, of some contemporary
alternatives to traditional evaluation.

Perrone, Vito, & Warren Strandberg. "A Perspective on Accountability." Teachers College
Record, Feb. 1972: 347-55. An argument for a broader, more inclusive basis for
accountability (instead of the usual "hard data") in order to respond to the educational value of
a variety of experiences.

Documents

Marcy Open School, 1973-1974 Goal Evaluation. Minneapolis: Southeast Alternatives, Aug.
1974. $1. A formative evaluation report by the school documentor, Ruth Anne Aldrich;
contains description of school, rationale for evaluation design and various kinds of data
on goals-related program.

Final Report: The St. Paul Open School. St. Paul, MN, 1973. An outside professional report
of one school year (72-73) based on a variety of instruments (cognitive and affective),
observations and interviews. Conclusions and recommendations set apart from findings.






Why are traditional evaluation procedures inadequate for educational programs
concerned with "process, content and context"? This timely and significant
publication outlines a new frame of reference for meaningful evaluation.

Contents



PART I Overview

4 Introduction: A Time for Rethinking
Vito Perrone

7 Alternative Ways in Educational Evaluation
Anne M. Bussis, Edward A. Chittenden, and Marianne Amarel

PART II Testing Problems and Possibilities

13 What Tests Do and Don't Do
Susan Silverman Stodolsky

18 Understanding the Gobbledy-gook: A People's Guide to Standardized
Results and Statistics
Michael Quinn Patton

27 Standardized Testing: Reform Is Not Enough!
George E. Hein

32 Another Look at What's Wrong with Reading Tests
Deborah Meier

37 The Stranglehold of Norms on the Individual Child
Lois Barclay Murphy

PART III Some Examples of Meaningful Evaluation

43 The Prospect School: Taking Account of Process
Patricia F. Carini

49 Marcy Open School: Feeding Back to Decision-Makers
Ruth Anne Aldrich

52 Children's Interviews
Nancy Ann Miller

Reflection in Teaching
Anne M. Bussis and Edward A. Chittenden

Selected Bibliography



